A Dict-like wrapper for HDF5/JLD files



Build Status Build Status

DictFiles provides an easy to use abstraction over the excellent JLD and HDF5 packages by Tim Holy. A DictFile is a standard JLD file which behaves similar to nested Dict's:

using DictFiles
dictopen("/tmp/test") do a
    a["key1"] = [1 2 3]
    a["key1"]            # == [1 2 3]
    a[]                  # == Dict("key1"=>[1 2 3])

    a["key2",1] = "One"
    a["key2","two"] = 2
    a["key2"]            # == Dict(1 => "One", "two" => 2)
    a["key2", 1]         # == "One"

    a[:mykey] = Dict("item" => 2.2)
    a[:mykey,"item"]    # == 2.2

It provides additional features for memory-mapping individual entries and compacting of the file to reclaim space lost through deletions / updates.


Simply add the package using Julia's package manager once:


You also need the master branch of HDF5.jl:


Then include it where you need it:

using DictFiles


DictFiles behave like nested Dicts. The primary way to assess a DictFile df is using df[keys...] = value and df[keys...], where keys is a tuple of primitive types, i.e. strings, chars, numbers, tuples, small arrays.

DictFile, dictopen, close

a = DictFile("/tmp/test")
# do something with a

A better way do to this, in case an error occurs, is:

dictopen("/tmp/test") do a
    # do something with a

Like open, both methods take a mode parameter, with the default being r+, with the added behavior for r+ that the file is created when it does not exist yet.

dictread, dictwrite

To read the entire contents of a file:

r = dictread(filename)

To overwrite the entire contents of a dictfile with a Dict:

dictwrite(somedict, filename)

Setting and getting, browsing, deleting

dictopen("/tmp/test") do a
    a["mykey"] = 1
    a["mykey"]                #  returns 1

    # following the metaphor of nested Dict's:
    a[] = Dict("mykey" => 1, "another key" => Dict("a"=>"A", :b =>"B", 1=>'c'))
    a[]                       # gets the entire contents as one Dict()

    a["another key", :b]      #  "B"
    a["another key"]          #  Dict("a"=>"A", :b =>"B", 1=>'c')

    keys(a)                   #  Dict("another key","mykey") 
    keys(a,"another key")     #  Dict("a",1,:b) 
    values(a)                 #  [Dict(:b=>"B",1=>'c',"a"=>"A"),1] 
    values(a,"another key")   #  Dict("A",'c',"B") 
    haskey(a,"mykey") ? println("has key!") : nothing

    # note that the default parameter for get comes second! 
    get(a, "default", "mykey")   #  1 
    delete!(a, "mykey")
    get(a, "default", "mykey")   #  "default"

In case you have a very nested data structure in your file and want to only work on a part of it:

dictopen("/tmp/test") do a 
    a[] = Dict("some"=>1, "nested data" => Dict("a" => 1, "b" => 2))
    b = DictFile(a, "nested data")   #  e.g., you can pass b to other functions
    keys(b)                          #  {"a","b"] 
    b["c"] = 3 
    a[]                              #  Dict("some"=>1, 
                                     #  "nested data" => Dict("a" => 1, "b" => 2, "c" => 3))


When fields get overwritten or explicitly deleted, HDF5 appends the new data to the file und unlinks the old data. The space of the original data is not recovered. For this, you can compact the file from time to time. This copies all data to a temporary file and replaces the original on success.



I'd be very grateful for bug reports und feature suggestions - please file an issue!

First Commit


Last Touched

over 3 years ago


40 commits

Used By: