How we designed Mark for Offline in Box Drive

Without doubt, the most highly anticipated feature of Box Drive is the ability to mark files for offline access (MFO). In practical terms, it brings the syncing ability of Box Sync to Box Drive, enabling you to work on files offline and to send and receive updates to those files when you are online. This feature sounds simple, but as we dug into it, we came across a lot of decision points about how the feature should work. Here we'll explain the reasoning behind the decisions we made. It's important to note that other products have reached different conclusions from ours and may behave differently! We'd actually be happy if competing products decided to work the same way, because it would mean less confusion for everyone. But until that day comes, the best we can do is explain our design decisions about how MFO works in Box Drive.

As background, using Box Drive, users can navigate through their entire Box tree, opening any file they have access to. Files are downloaded as needed and opened locally. They are cached (up to a certain disk limit) so if users repeatedly open the same files they don't have to be downloaded again. If a file changes on Box, Box Drive can simply remove it from the cache. The next time the user opens it, they will download the latest version to their cache and open that. When the cache becomes full, the least recently used files are removed to make room for new files.
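The caching behavior described above can be sketched in a few lines of Python. This is only an illustration and not Box Drive's actual implementation: the `FileCache` class, its method names, and the byte-based limit are all assumptions made for the example.

```python
from collections import OrderedDict


class FileCache:
    """Hypothetical least-recently-used cache keyed by file id (sizes in bytes)."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # file_id -> size, least recently used first

    def open(self, file_id, size):
        """Return True on a cache hit, False if the file had to be downloaded."""
        if file_id in self._entries:
            self._entries.move_to_end(file_id)  # now the most recently used
            return True
        # Evict least recently used files until the new file fits.
        while self._entries and self.used_bytes + size > self.limit_bytes:
            _, evicted_size = self._entries.popitem(last=False)
            self.used_bytes -= evicted_size
        self._entries[file_id] = size
        self.used_bytes += size
        return False

    def invalidate(self, file_id):
        """Drop a file that changed on the server; the next open downloads it again."""
        size = self._entries.pop(file_id, None)
        if size is not None:
            self.used_bytes -= size

    def cached_ids(self):
        """File ids currently cached, least recently used first."""
        return list(self._entries)
```

Opening a file that is already cached only refreshes its position, while `invalidate` models the "file changed on Box" case by forcing a fresh download on the next open.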

When a folder is marked for offline (at this time the ability is reserved for whole folders, not individual files), its content is downloaded in the background and stored locally. MFO content is never removed to make more room in the cache. Of course, you could decide to MFO a very large folder and fill up your hard disk! All is not lost: you can change your mind and set the folder back to online-only, or just set some of the sub-folders back to online-only in order to reduce the amount of content marked for offline. As shorthand, around the office this "undoing the MFO to make something online-only" became known as unMFO. In the following diagram we see a folder hierarchy where the user marked A for offline, which marks all folders below it as MFO, then reset X and Y back to online-only (MFO A and unMFO X and Y for short).

If a new folder D was created in B it would be automatically MFO. If a new folder Z was created in X it would be unMFO. That seems obvious. What about a new folder M created in A? Because its parent A is MFO, it would be MFO also. Even if B and C were first unmarked for offline, new content in A is marked for offline. This is one of the design decisions the Box Drive team had to make - should we consider the likelihood that you would want the folder MFO or not based on its peers? We decided to only use the parent folder to decide. Even if every folder under A was unmarked for offline, we make the new folder MFO because the parent folder is MFO.

Box is all about sharing and collaboration. What about folders and files created by colleagues? If a colleague uploads some content (folder E) into folder C, will that be Marked for Offline or not? We decided to not distinguish between changes made to the folder tree by different users. In a few weeks, will people remember who created folder E? It really shouldn't matter if we're all on the same team. So whether you create folder E locally, or a colleague creates E on Box, it will become MFO, because the parent folder is MFO. Where an action comes from doesn't matter. This is true for all actions including moving content.
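The creation rule from the last two paragraphs fits in one small function. The sketch below is illustrative only: the `Folder` class, the `create_folder` helper, and the `created_by` parameter are hypothetical names, not part of Box Drive.

```python
class Folder:
    """Hypothetical folder node; `offline` is the single MFO bit."""

    def __init__(self, name, offline=False, parent=None):
        self.name = name
        self.offline = offline
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def create_folder(parent, name, created_by="me"):
    """Create a folder whose MFO state is copied from its parent.

    `created_by` is accepted but deliberately ignored: it does not matter
    whether you or a colleague created the folder. Sibling folders are not
    consulted either; only the parent decides.
    """
    return Folder(name, offline=parent.offline, parent=parent)
```

Even if every existing child of an MFO folder has been set back to online-only, a folder created there still starts out MFO, because only `parent.offline` is read.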

Let's look deeper at moves. It's clear that when one moves MFO folders among folders that are also MFO, they should stay MFO, and likewise when one moves unMFO folders among folders that are unMFO, they should stay unMFO. But what about moving MFO to unMFO, or unMFO to MFO? On the above diagram it would be moving a folder from the left side to the right side or from the right to the left. Here we considered the wishes of our customers and their circumstances over "mathematical purity." If you are offline and something you previously marked for offline is missing, you're going to be upset. This could happen if a colleague moved a folder to a new location (perhaps in error). So while moving a folder into an MFO folder makes it MFO, moving a folder into an unMFO folder does NOT make it unMFO. This can be seen in the following diagram where we moved two folders to new locations. While folder Z became MFO, folder E stayed MFO.

Content that moves into an MFO folder becomes MFO, but once it becomes MFO it stays MFO until you unMFO it. In the diagram we can see that folder Z moved under C and became MFO. Folder E was moved to Y but it still maintains the MFO state. What we wanted was to be able to guarantee that once you MFO some content, you will have access to it. What we did not want to say was something like "Sure your MFO content will be available when you're offline... unless somebody else accidentally moves it to an unMFO folder." So even though the diagram might look prettier if we were to unMFO folder E, we don't do that. Another thought was that while folder A was explicitly marked MFO, the folders below it got marked MFO automatically. Maybe the user doesn't even care about folder E or folder D. We could have had some hidden attribute, such that A is the MFO root folder and B, C, M, E, etc. just inherit their state from their parent. That would allow E to adapt to wherever it was moved. We decided that while this might seem elegant, there wasn't a good way to show the difference, and it ends up being confusing to have multiple behaviors. Plus, you would have to explicitly mark all the folders you want to protect from getting unMFO'ed accidentally, and you would likely only discover this when you're offline and the content you need is gone. To avoid this and keep things simple, a folder is either MFO or unMFO; there is nothing else. In the diagram, folders C and E have the same MFO bit as folder A or folder B.
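Since a folder carries only a single MFO bit, the move rule for the folder being moved reduces to a logical OR. The function name below is made up for illustration, and the sketch covers only the moved folder itself, not its subtree.

```python
def offline_after_move(folder_is_offline: bool, destination_is_offline: bool) -> bool:
    """MFO state of a folder after it is moved into a destination folder.

    Moving into an MFO folder makes the folder MFO; moving into an
    online-only folder leaves its state untouched.
    """
    return folder_is_offline or destination_is_offline


# Print the four cases: only unMFO moved into unMFO ends up unMFO.
for folder_state in (True, False):
    for destination_state in (True, False):
        result = offline_after_move(folder_state, destination_state)
        print(f"folder MFO={folder_state}, destination MFO={destination_state} -> MFO={result}")
```

The second argument can only ever turn the bit on, never off, which is what backs the guarantee that marked content stays available offline.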

We see that with MFO and unMFO we can start to get alternating colors of folders. Around the office, these became known as "zebras." The folders A/Y/E are a zebra, for example. How to handle these animals is an interesting question. For example, if I unMFO a zebra, should that unMFO the whole thing or just the top-most "stripe"? In the diagram below, the user unMFO's folder A, but buried inside it is a folder E that is MFO even though it sits under an unMFO folder. The user must have marked E specifically, so marking it unMFO would "lose" that work. On the other hand, the user unMFO'ed folder A, telling us they are not interested in this entire folder. It's not a great experience to have to dig around in all the folders and subfolders to unMFO a whole tree. Between the two options, we decided it was a better user experience to apply MFO operations to the folder and the entire tree below it. In this case we eliminate the zebra because we applied a change to the MFO state.
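Applying an MFO operation to the whole tree is a simple recursive walk. The sketch below is a minimal illustration with hypothetical names (`Folder`, `set_offline`), not the product's code.

```python
from dataclasses import dataclass, field


@dataclass
class Folder:
    """Hypothetical folder node carrying a single MFO bit."""
    name: str
    offline: bool = False
    children: list = field(default_factory=list)


def set_offline(folder, offline):
    """Apply an explicit MFO or unMFO to a folder and its entire subtree."""
    folder.offline = offline
    for child in folder.children:
        set_offline(child, offline)


# The A/Y/E zebra: A is MFO, Y is online-only, E is MFO again.
e = Folder("E", offline=True)
y = Folder("Y", offline=False, children=[e])
a = Folder("A", offline=True, children=[y])

set_offline(a, False)  # unMFO A: every stripe below becomes online-only
print([(f.name, f.offline) for f in (a, y, e)])
```

After the call no folder under A is MFO, so the zebra is gone.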

But zebras are frequently preserved. Most of the time, moving a folder does not involve a change in its MFO state. When a move doesn't affect the folder's own MFO state, it doesn't change the MFO state of any other folder in the tree either.

Notice that folder W in the diagram did not change state when we moved folder E. This is because folder E did not change state so the "zebra" was preserved.

Remember, there is one case where moving a folder results in a change of state. This is the case seen earlier, where content moved into an MFO folder becomes MFO. Should we just mark the top-most folder, the top "stripe," or the whole tree? We decided to do the whole tree, mimicking what would happen when the user marks the folder. Changing the MFO state of a folder affects its entire subtree.

In the diagram we see that moving folder X results in a change of state to X and we then apply that to the entire tree below. If that tree was a zebra, it won't be any longer.
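Putting the move rules together, a move only touches MFO state when an online-only folder lands in an MFO folder, and in that case the change applies to the whole subtree. The following sketch uses hypothetical `Folder`, `set_offline`, and `move` helpers and is meant as an illustration of the rules, not as Box Drive's implementation.

```python
class Folder:
    """Hypothetical folder node; `offline` is the single MFO bit."""

    def __init__(self, name, offline=False, parent=None):
        self.name = name
        self.offline = offline
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def set_offline(folder, offline):
    """Apply an MFO state to a folder and everything below it."""
    folder.offline = offline
    for child in folder.children:
        set_offline(child, offline)


def move(folder, new_parent):
    """Move a folder, changing MFO state only when it becomes MFO."""
    if folder.parent is not None:
        folder.parent.children.remove(folder)
    folder.parent = new_parent
    new_parent.children.append(folder)
    if new_parent.offline and not folder.offline:
        # State change: treat it like the user marking the folder,
        # so any zebra below it disappears.
        set_offline(folder, True)
    # Otherwise the folder's own state is unchanged, so nothing below
    # it changes either and any zebra is preserved.
```

Moving an MFO folder that contains an online-only child into an online-only folder leaves both states alone, while moving an online-only folder into an MFO folder turns its whole subtree MFO.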

I hope this helps explain both how the MFO feature works and how we came to the decisions we made. If you found this useful, please tweet @BoxEng. In the next post I can explain icon overlays and our decisions about when something is shown as in progress or in an error state.