connection failed between gateway and (S4) server #2483

Closed
opened 2015-08-18 17:15:22 +00:00 by zooko · 3 comments

I was just trying to upload a file and got a failure. Here is some diagnostic info about it...

zooko added the code-network, normal, defect, 1.10.1 labels 2015-08-18 17:15:22 +00:00
zooko added this to the undecided milestone 2015-08-18 17:15:22 +00:00
Author

Attachment incident-2015-08-18--17-04-19Z-rfjayvy.flog.bz2 (9636 bytes) added

I hit the "report an incident" button after this failure.

daira commented 2015-08-18 20:18:08 +00:00
Owner

Zooko and I successfully reproduced this failure, and a similar failure for SDMF. For MDMF, the storage server is consistently OOM-killed when it approaches the 595 MB available memory of a t1.micro EC2 instance (at around 543MB resident):

Aug 18 19:36:20 ip-10-196-59-108 kernel: [33065383.602943] Out of memory: Kill process 1662 (tahoe) score 892 or sacrifice child
Aug 18 19:36:20 ip-10-196-59-108 kernel: [33065383.602960] Killed process 1662 (tahoe) total-vm:577324kB, anon-rss:542684kB, file-rss:0kB

For SDMF, the publish fails due to a Python `MemoryError` at around 442MB resident. The process survives but stays at the same memory usage for a while, then drops to 281MB resident.

More details to follow.
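
For anyone checking the numbers, here is a minimal sketch of how the resident-memory figure can be read off the kernel's "Killed process" line quoted above. The regular expression only targets the exact line format shown in this ticket (other kernel versions print different fields), and the helper name `parse_oom_kill` is hypothetical, not part of Tahoe-LAFS.

```python
import re

# Kernel log line copied verbatim from this ticket.
LOG_LINE = (
    "Aug 18 19:36:20 ip-10-196-59-108 kernel: [33065383.602960] "
    "Killed process 1662 (tahoe) total-vm:577324kB, anon-rss:542684kB, file-rss:0kB"
)

# Assumption: matches only the field layout seen in the line above.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, "
    r"anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_oom_kill(line):
    """Return the fields of an OOM 'Killed process' line, or None if no match."""
    match = KILLED_RE.search(line)
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "name": match.group("name"),
        "total_vm_kb": int(match.group("total_vm")),
        "anon_rss_kb": int(match.group("anon_rss")),
        "file_rss_kb": int(match.group("file_rss")),
    }


info = parse_oom_kill(LOG_LINE)
# Dividing the kB value by 1000 reproduces the "around 543MB resident" figure.
rss_mb = info["anon_rss_kb"] / 1000
print(f"{info['name']} (pid {info['pid']}) anon-rss ~ {rss_mb:.0f} MB")
```

Comparing `rss_mb` with the 595 MB available on the instance shows how little headroom the storage server had left when it was killed.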


This is probably the wrong place for S4 bug reports. Also the S4 deployment infrastructure has changed such that the limits discussed above no longer exist.

exarkun added the was already fixed label 2020-01-17 16:20:12 +00:00
Reference: tahoe-lafs/trac-2024-07-25#2483