IndexedTables provides tabular data structures where some of the columns form a sorted index. It provides the backend to JuliaDB, but can be used on its own for efficient in-memory data processing and analytics.
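IndexedTables offers two data structures: `IndexedTable` and `NDSparse`. The two differ mainly in how data is accessed; both support the same table operations (`select`, `filter`, etc.).

To install the package and try it out:

    using Pkg
    Pkg.add("IndexedTables")
    using IndexedTables

    # Build a table from a NamedTuple of columns
    t = table((x = 1:100, y = randn(100)))

    # Extract the column x
    select(t, :x)

    # Keep only the rows where y is positive
    filter(row -> row.y > 0, t)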
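To make the idea of a sorted index concrete, here is a minimal sketch that builds a table from unsorted data and declares one column as the primary key. The column names `id` and `score` and the sample values are made up for this illustration; only `table`, `pkey`, and `select` come from the package.

```julia
using IndexedTables

# Hypothetical, deliberately unsorted data
ids    = [3, 1, 2]
scores = [0.3, 0.1, 0.2]

# Declaring :id as the primary key sorts the rows by that column
t = table((id = ids, score = scores); pkey = [:id])

# Columns come back in primary-key order, with each row kept intact
sorted_ids    = select(t, :id)
sorted_scores = select(t, :score)
```

Because whole rows are reordered together, each score stays paired with its original id.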
## IndexedTable vs. NDSparse

First let's create some data to work with.

    using Dates

    city = vcat(fill("New York", 3), fill("Boston", 3))
    dates = repeat(Date(2016,7,6):Day(1):Date(2016,7,8), 2)
    vals = [91, 89, 91, 95, 83, 76]
### IndexedTable

- Sorted by primary key(s), `pkey`.
- Data is accessed as a Vector of NamedTuples.

Here the rows are sorted by `city`, then by `dates`:

    julia> t1 = table((city = city, dates = dates, values = vals); pkey = [:city, :dates])
    Table with 6 rows, 3 columns:
    city        dates       values
    ──────────────────────────────
    "Boston"    2016-07-06  95
    "Boston"    2016-07-07  83
    "Boston"    2016-07-08  76
    "New York"  2016-07-06  91
    "New York"  2016-07-07  89
    "New York"  2016-07-08  91

    julia> t1[1]
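Row-oriented access on an `IndexedTable` can be sketched as follows. This is a small illustration rather than a complete tour of the API: it rebuilds the sample table so that it runs on its own, and the variable names `first_row`, `all_values`, and `n` are chosen only for this example.

```julia
using IndexedTables, Dates

# Same sample data as used in this README
city  = vcat(fill("New York", 3), fill("Boston", 3))
dates = repeat(Date(2016,7,6):Day(1):Date(2016,7,8), 2)
vals  = [91, 89, 91, 95, 83, 76]

t1 = table((city = city, dates = dates, values = vals); pkey = [:city, :dates])

# Integer indexing returns one row as a NamedTuple
first_row = t1[1]
first_row.city      # fields are accessed by column name
first_row.values

# rows(t1) yields the same NamedTuples, in primary-key order
all_values = [r.values for r in rows(t1)]
n = length(t1)
```

Since the table is sorted by `city` and then `dates`, the first row is the earliest Boston entry, not the first element of the input vectors.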
### NDSparse
- Sorted by index variables (first argument).
- Data is accessed as an N-dimensional sparse array with arbitrary indexes.
The first argument to `ndsparse` holds the index columns and the second holds the data:

    julia> t2 = ndsparse((city=city, dates=dates), (value=vals,))
    2-d NDSparse with 6 values (1 field named tuples):
    city        dates      │ value
    ───────────────────────┼──────
    "Boston"    2016-07-06 │ 95
    "Boston"    2016-07-07 │ 83
    "Boston"    2016-07-08 │ 76
    "New York"  2016-07-06 │ 91
    "New York"  2016-07-07 │ 89
    "New York"  2016-07-08 │ 91

    julia> t2["Boston", Date(2016, 7, 6)]
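Array-style access on an `NDSparse` might look like the sketch below. It rebuilds the sample data so that it runs on its own; the names `entry`, `boston_first`, and `boston` are chosen only for this illustration, and the colon slice is shown as one common way to select along a dimension.

```julia
using IndexedTables, Dates

# Same sample data as used in this README
city  = vcat(fill("New York", 3), fill("Boston", 3))
dates = repeat(Date(2016,7,6):Day(1):Date(2016,7,8), 2)
vals  = [91, 89, 91, 95, 83, 76]

t2 = ndsparse((city = city, dates = dates), (value = vals,))

# Look up a single entry by the values of its index columns
entry = t2["Boston", Date(2016, 7, 6)]
boston_first = entry.value

# A colon selects every entry along that dimension
boston = t2["Boston", :]
```

The lookup keys are the index values themselves (a city name and a date), not integer positions, which is the main practical difference from `IndexedTable`.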
## Get started
For more information, check out the [JuliaDB Documentation](http://juliadb.org/latest/index.html).