There is a partial solution that avoids the issue: Pickle to memory, then write. I imagine "streaming pickle" is a thing and cheaper in terms of memory use, but I'm happy to pay for improved user experience.
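For concreteness, a minimal sketch of the pickle-to-memory idea (the function name and signature are hypothetical, not the dict's actual API):

```python
import pickle

def store(path, obj):
    # Pickle entirely in memory first: if pickling raises, no file
    # has been touched yet, so a crashing pickle cannot leave a
    # truncated key/value file behind.
    data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, "wb") as f:
        f.write(data)
```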
A separate issue is that I don't think the dict is currently able to recover from an invalid key file even on a renewed write (if I'm reading the code correctly).
> There is a partial solution that avoids the issue: Pickle to memory, then write. I imagine "streaming pickle" is a thing and cheaper in terms of memory use, but I'm happy to pay for improved user experience.
I think it would also be safe for the writer to delete the bad key file, as long as the writer doesn't delete the lock. That way, there will still be lock warnings but there won't be any invalid key file warnings.
> A separate issue is that I don't think the dict is currently able to recover from an invalid key file even on a renewed write (if I'm reading the code correctly).
If the dir exists (which it will if there's a broken key file in it), I believe we'll end up in this code path (coming from store_if_not_present, as most of our writes do).
You are right. It is not possible to overwrite a bad key file without a race condition. Unfortunately, I think that is inherent to the locking algorithm. That was by design (hence being write-once).
I have been thinking about how to get rid of locking altogether. One possibility is to use the fact that rename() is atomic, according to POSIX. So, the writer can write everything to a temporary file and then it will call rename() on it to make it visible to the readers.
Not sure how well this works on NFS or Windows, however.
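A sketch of what the rename() idea might look like (tempfile handling and names are assumptions, not the current code):

```python
import os
import pickle
import tempfile

def store_atomically(path, obj):
    data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    dirname = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same file system as the
    # target, since rename() is only atomic within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Atomic on POSIX: readers see either the old complete file
        # or the new complete file, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```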
> Not sure how well this works on NFS or Windows, however.
I'm OK with being correct according to POSIX and only that. For NFS, users learn quickly to put caches onto (local-only) scratch partitions.
> It is not possible to overwrite a bad key file without a race condition. Unfortunately, I think that is inherent to the locking algorithm. That was by design (hence being write-once).
The first time the lock is acquired, it is safe; the second and subsequent times, it is not. That is because by the time the lock is acquired again, there could be active processes reading the directory, since readers do not acquire a lock.
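To make the race concrete, a hypothetical sketch (not the dict's actual code) of the two paths that can collide:

```python
import pickle

def read_entry(key_path):
    # Readers take no lock, by design.
    with open(key_path, "rb") as f:
        # <-- a writer that re-acquired the lock can truncate and
        #     rewrite key_path right here; this reader then sees a
        #     half-written file.
        return pickle.load(f)

def overwrite_entry(key_path, data):
    # Holding the lock does not help: opening with "wb" truncates
    # the file in place while a reader may still be reading it.
    with open(key_path, "wb") as f:
        f.write(data)
```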
Just so we are on the same page, what do you think should happen in the following two situations?
1. a reader loads an invalid key/value file
2. a writer calls store_if_not_present(), but the key/value file is invalid
Should both of these result in removing the key/value file? In that case, would the writer have to load the key/value file beforehand to determine if it is valid?
(Both of these are IMO--up for negotiation if you disagree)
> a reader loads an invalid key/value file
This should result in the same behavior as if there were no entry there. If we expect invalid key files to arise under "normal operating conditions" (which would include crashing pickles), then the 'invalid key file' warning should be demoted to DEBUG.
> a writer calls store_if_not_present(), but the key/value file is invalid
I would like it if this could result in the on-disk data recovering to a valid state, with the new valid entry persisted.
> In that case, would the writer have to load the key/value file beforehand to determine if it is valid?
I don't think there's a way around that. An invalid key file should be the equivalent of "there's nothing there".
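A sketch of what that could look like on the read side under this "invalid means absent" proposal (hypothetical; the exception choices are placeholders):

```python
import pickle

def load_entry(key_path):
    # Any failure to unpickle is reported exactly like a missing entry.
    try:
        with open(key_path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        raise KeyError(key_path)
    except (pickle.UnpicklingError, EOFError):
        # Truncated or otherwise invalid file: behave as if there
        # were nothing there.
        raise KeyError(key_path)
```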
I know it's been a while, but I've been thinking about this recently.
I don't agree that an invalid key or value file should be ignored silently. There is at least a possibility that the file is invalid because, although it pickled correctly, the unpickle implementation is broken. I think deleting it should at least result in a warning.
Also, I think requiring WriteOncePersistentDict to autodelete invalid entries goes against it being write-once. If we require some sort of recovery from bad pickles, then I think a different solution is needed. In my opinion, the right thing to do would be to try to create a version of PersistentDict that uses an in-memory cache. To maintain cache consistency, this will require some sort of versioning scheme. Ideally, I think the versioning scheme shouldn't have to do exclusive file system operations for reads. Potentially, we could use an sqlite database or use POSIX advisory locking.
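For the advisory-locking option, the point would be that readers take a shared lock (on which other readers do not block) while writers take an exclusive one. A sketch assuming fcntl.flock, which is POSIX-only:

```python
import fcntl
import os

def read_locked(path):
    fd = os.open(path, os.O_RDONLY)
    try:
        # Shared lock: any number of readers may hold it at once;
        # a writer's exclusive lock waits until they are all done.
        fcntl.flock(fd, fcntl.LOCK_SH)
        return os.read(fd, os.fstat(fd).st_size)
    finally:
        os.close(fd)  # closing the descriptor releases the lock

def write_locked(path, data):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        os.ftruncate(fd, 0)
        os.write(fd, data)
    finally:
        os.close(fd)
```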
From a correctness perspective, you've convinced me. We shouldn't ignore invalid files.
My concern comes from a practical angle: Suppose something crashes and a user ends up with an invalid state. As long as there is a way to recover, I'm happy. This could be as simple as complaining with an exception (not just a warning) including a recommendation to rm -Rf THIS_OR_THAT to recover.
Okay - it would not take much beyond what we have to make it complain to the user with an exception if there is an invalid file, with directions for how to recover.
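Something along these lines, perhaps (the exception name and wording are placeholders):

```python
class CorruptedCacheError(RuntimeError):
    pass

def complain_about_invalid_entry(container_dir, item_path):
    # Fail loudly rather than silently skipping a bad entry, and
    # tell the user how to get back to a working state.
    raise CorruptedCacheError(
        f"{item_path} is invalid; the cache may be corrupted, e.g. "
        f"by an interrupted write. To recover, delete the cache "
        f"directory: rm -Rf {container_dir}")
```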
I also think we should adopt the "pickle to memory before writing anything" approach to ensure robustness against bad pickle implementations.
A related question: fsync() or not? Currently, we don't fsync after writing. I think the only added benefit of this would be to ensure more robustness against power failures. The downside would be longer wait times for writing.
> A related question: fsync() or not? Currently, we don't fsync after writing. I think the only added benefit of this would be to ensure more robustness against power failures. The downside would be longer wait times for writing.
I honestly don't know. I think of writes to these caches as relatively infrequent, so it may be the case that adding the fsync does not cost us much. At the same time, I don't particularly care whether the cache is robust to power failures. If it's screwed up, you delete it and wait for a bit. It's not exactly millions of dollars' worth of customer invoices--so the design point is quite different from, say, SQLite's.
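For reference, making the write durable would add two syncs to the rename-based sketch above: one on the file data, and one on the containing directory so the rename itself survives a power failure (names again hypothetical):

```python
import os
import pickle
import tempfile

def store_durably(path, obj):
    data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push file contents to stable storage
        os.replace(tmp_path, path)
        # fsync the directory so the rename is durable as well.
        dir_fd = os.open(dirname, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```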