Name: Making Swift More Robust to Handling Failure
Start: 2013-11-07T15:30:00+0800
End: 2013-11-07T16:10:00+0800

Please note: This schedule is for OpenStack Active Technical Contributors participating in the Icehouse Design Summit sessions in Hong Kong. These are working sessions to determine the roadmap of the Icehouse release and make decisions across the project. To see the full OpenStack Summit schedule, including presentations, panels and workshops, go to http://openstacksummitnovember2013.sched.org.


Making Swift More Robust to Handling Failure

I would like to take some time to focus on areas that could make Swift more robust in failure scenarios. This may include discussion of (but is not limited to):

1. Better error limiting. The error limiting code has gotten a bit stale and could use an audit and cleanup. On top of that, errors are tracked at the worker level, so on a machine with many workers it can take a while before a node is completely error limited. It might be useful to have a local cache that is shared across the workers.

2. Early return on writes. Currently, the elapsed time for a write is that of the slowest of the 3 replica writes. If you have a badly behaving node, this can cause a lot of issues. We should be able to return to the user as soon as 2 replicas have been successfully written.

3. Async fsync. It might be useful to have an optional setting that would allow an object server to return immediately upon completion of the write and issue the fsync asynchronously. This of course comes with risk, but I would like to discuss possible ways to mitigate it.
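As a rough illustration of item 2, the early return could look something like the sketch below. This is not Swift's proxy code: `write_with_quorum` and the writer callables are hypothetical names introduced here, and plain threads stand in for whatever concurrency model the proxy actually uses. It only shows the idea of answering once a quorum of replica writes has succeeded while the slowest write finishes in the background.

```python
import concurrent.futures
import time


def write_with_quorum(replica_writers, quorum):
    """Return True as soon as `quorum` replica writes have succeeded.

    `replica_writers` is a list of zero-argument callables (hypothetical
    stand-ins for per-node writes); each returns True on success.
    Returns False if every write has finished without reaching quorum.
    """
    pool = concurrent.futures.ThreadPoolExecutor(
        max_workers=len(replica_writers))
    futures = [pool.submit(writer) for writer in replica_writers]
    successes = 0
    try:
        # as_completed yields futures in the order they finish,
        # so a slow node does not hold up the faster ones.
        for future in concurrent.futures.as_completed(futures):
            try:
                if future.result():
                    successes += 1
            except Exception:
                pass  # a failed replica write simply does not count
            if successes >= quorum:
                return True
        return False
    finally:
        # Do not wait for stragglers; they keep running in the background.
        pool.shutdown(wait=False)


if __name__ == "__main__":
    def fast_write():
        time.sleep(0.01)
        return True

    def slow_write():
        time.sleep(0.5)  # simulates a badly behaving node
        return True

    start = time.monotonic()
    ok = write_with_quorum([fast_write, fast_write, slow_write], quorum=2)
    print(ok, "after %.2fs" % (time.monotonic() - start))
```

A real implementation would still need to decide what happens to the remaining write if it later fails, for example leaving the missing replica for replication to repair.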

There are other smaller things as well, and I would be curious to hear other ideas for how we can make Swift more robust.

(Session proposed by creiht)

Thursday November 7, 2013 3:30pm - 4:10pm HKT
AWE Level 2, Room 201C

Swift

Icehouse Design Summit
