Update Dec. 2018: Mocha.jl is now deprecated. The latest version works with Julia v0.6. If you have an existing legacy codebase built on Mocha that you want to update for Julia v1.0, pull request 255 contains fixes for the CPU backend only, with all unit tests passing under Julia v1.0.
Mocha.jl was developed in the relatively early days of Julia. Both Julia and its ecosystem have since evolved significantly, and with new technologies such as writing GPU kernels directly in Julia and general automatic differentiation support, the Mocha codebase has become old and primitive by comparison. Reworking Mocha with these technologies would require non-trivial effort, and newer solutions already exist, so it is a good time to retire Mocha.jl.
If you are interested in doing deep learning with Julia, please check out alternative packages that are more up to date and actively maintained. In particular, there are Knet.jl and Flux.jl for pure-Julia solutions, and MXNet.jl and TensorFlow.jl for wrappers around existing deep learning systems.
Mocha is a Deep Learning framework for Julia, inspired by the C++ framework Caffe. Efficient implementations of general stochastic gradient solvers and common layers in Mocha can be used to train deep / shallow (convolutional) neural networks, with (optional) unsupervised pre-training via (stacked) auto-encoders.
To install the release version, add the Mocha package from the Julia console using Julia's package manager. To install the latest development version, install the package from its development branch instead. You can then run the built-in unit tests to verify that everything is functioning properly on your machine.
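The installation steps above can be sketched as follows. This is a minimal sketch that assumes the Julia v0.6 package manager (the `Pkg` module available by default in the v0.6 console). Using `Pkg.checkout` to obtain the development version is an assumption, and not necessarily the command the authors intended.

```julia
# Assumes Julia v0.6, where Pkg functions take the package name as a string.

# Install the release version.
Pkg.add("Mocha")

# Assumption: switch the installed package to its development (master) branch.
Pkg.checkout("Mocha")

# Run the built-in unit tests.
Pkg.test("Mocha")
```

These commands need network access and a Julia v0.6 installation. On Julia v1.0 the package manager works differently, and Mocha itself is not supported there without the fixes mentioned in the update notice.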
Please refer to the MNIST tutorial on how to prepare the MNIST dataset for the following example. The complete code for this example is located at examples/mnist/mnist.jl. See below for detailed documentation of other tutorials and the user guide.
```julia
using Mocha

data  = HDF5DataLayer(name="train-data",source="train-data-list.txt",batch_size=64)
conv  = ConvolutionLayer(name="conv1",n_filter=20,kernel=(5,5),bottoms=[:data],tops=[:conv])
pool  = PoolingLayer(name="pool1",kernel=(2,2),stride=(2,2),bottoms=[:conv],tops=[:pool])
conv2 = ConvolutionLayer(name="conv2",n_filter=50,kernel=(5,5),bottoms=[:pool],tops=[:conv2])
pool2 = PoolingLayer(name="pool2",kernel=(2,2),stride=(2,2),bottoms=[:conv2],tops=[:pool2])
fc1   = InnerProductLayer(name="ip1",output_dim=500,neuron=Neurons.ReLU(),bottoms=[:pool2],
                          tops=[:ip1])
fc2   = InnerProductLayer(name="ip2",output_dim=10,bottoms=[:ip1],tops=[:ip2])
loss  = SoftmaxLossLayer(name="loss",bottoms=[:ip2,:label])

backend = DefaultBackend()
init(backend)

common_layers = [conv, pool, conv2, pool2, fc1, fc2]
net = Net("MNIST-train", backend, [data, common_layers..., loss])

exp_dir = "snapshots"
solver_method = SGD()
params = make_solver_parameters(solver_method, max_iter=10000, regu_coef=0.0005,
                                mom_policy=MomPolicy.Fixed(0.9),
                                lr_policy=LRPolicy.Inv(0.01, 0.0001, 0.75),
                                load_from=exp_dir)
solver = Solver(solver_method, params)

setup_coffee_lounge(solver, save_into="$exp_dir/statistics.jld", every_n_iter=1000)

# report training progress every 100 iterations
add_coffee_break(solver, TrainingSummary(), every_n_iter=100)

# save snapshots every 5000 iterations
add_coffee_break(solver, Snapshot(exp_dir), every_n_iter=5000)

# show performance on test data every 1000 iterations
data_test = HDF5DataLayer(name="test-data",source="test-data-list.txt",batch_size=100)
accuracy = AccuracyLayer(name="test-accuracy",bottoms=[:ip2, :label])
test_net = Net("MNIST-test", backend, [data_test, common_layers..., accuracy])
add_coffee_break(solver, ValidationPerformance(test_net), every_n_iter=1000)

solve(solver, net)

destroy(net)
destroy(test_net)
shutdown(backend)
```
The Mocha documentation is hosted at readthedocs.org.