The past three months have been an amazing ride, with something new to learn at every stage. I got the chance to work with state-of-the-art machine learning models in Computer Vision
and Natural Language Processing. The work was a constant cycle of coding, debugging and testing,
repeated until the results were right. Along the way I learned a great deal about Julia, Flux and Machine
Learning in general.
My aim was to work on the Flux ecosystem. Flux is a purely Julian Machine Learning framework that makes implementing
neural nets as simple as writing mathematical equations, which makes Machine Learning easier to understand,
implement and reason about. One way to showcase this simple yet dynamic nature of Flux is to load and
run models built in other mature frameworks, such as TensorFlow, PyTorch, MXNet and Caffe. My project lies at the heart
of this problem: its aim was to implement readers for loading and running advanced, state-of-the-art models
in Flux.
I had proposed to work on two such packages: ONNX.jl and Keras.jl. I'm glad I was able
to meet the objectives, while also contributing to other supporting packages in the Flux ecosystem,
such as Flux.jl, Metalhead.jl and TextAnalysis.jl, to name a few. In this blog, I'd like to list my work, point out the PRs that still
need some attention, and discuss the future of these packages.
ONNX.jl
To know more about ONNX.jl, please refer to my ONNX-specific blog post
here. The coding period started with work on ONNX.jl. My target was to make
it good enough to load all supported models into Flux. ONNX.jl was pretty much non-existent before, so
I had to write most of it from scratch. The main challenge was that ONNX itself is still under development,
so I had to keep an eye on the changes made in its ecosystem and how they affected ONNX.jl.
Apart from the operators, some of the ONNX models were also not production-ready, but I was able to load a few that
produced good results.
The ONNX.jl interface is very easy to use: all you need is the model.onnx file.
Loading this model into Flux is as simple as:
```julia
julia> using ONNX
julia> ONNX.load_model("model.onnx")
julia> weights = ONNX.load_weights("weights.bson")
julia> final_model = include("model.jl")
```
And final_model is the desired Flux model.
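To sanity-check the result, you can call final_model directly. Here's a minimal sketch; the input shape below is just an assumption for a typical image model, and the real shape depends on the model you loaded:

```julia
using Flux  # for onecold

# Dummy input in Flux's WHCN layout (width × height × channels × batch).
# A real use case would substitute a properly preprocessed image.
x = rand(Float32, 224, 224, 3, 1)

y = final_model(x)             # forward pass through the loaded model
println(Flux.onecold(vec(y)))  # index of the highest-scoring class
```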
At this stage, ONNX.jl can be used to load and run supported models efficiently.
Links to commits:
As I said, I had to write most of the package from scratch, so you can find my commits here. However, I'd like to highlight a few major commits.
- Adding the ProtoBuf file
- Adding the new Julian data types
- Conversion to these newer types
- Extracting weights from the correct source.
- Basic Test module
- Logical Operators and corresponding tests.
Apart from this, I also wrote operator tests. In total, I was able to write tests for approximately 125 operators. I also opened a few issues in the main ONNX repository, highlighting bugs in models, errors, and missing details in the docs. These issues can be found here.
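For a flavour of what these look like, here's an illustrative sketch of an operator test (not the exact ONNX.jl test code): the Flux counterpart of an ONNX operator is checked against a known expected output.

```julia
using Test
using Flux  # provides relu

# Illustrative check: Flux's relu should match the output expected
# from the ONNX Relu operator on the same input.
@testset "Relu operator" begin
    x = Float32[-1.0, 0.0, 2.5]
    expected = Float32[0.0, 0.0, 2.5]
    @test relu.(x) == expected
end
```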
ONNX.jl also has four model tests at the moment, for the following models:
- SqueezeNet
- Emotion FERPlus
- VGG19
- MNIST Classifier
Future Work:
As I mentioned, ONNX is under development, so we need to keep track of the changes taking place in ONNX and how they relate to ONNX.jl. Apart from this, operator tests also need to be added as new operators are introduced in ONNX and gain support in Flux.
Keras.jl
The intention behind Keras.jl was to load and run Keras models in Flux.
I was able to load and run quite a few Keras models using Keras.jl. This involved mapping each Keras layer to
the corresponding Flux operator and converting the result into a Julian graph. The models covered
Computer Vision (ResNet), Natural Language Processing (text generation, sentiment analysis with and without
GloVe embeddings), and stock and Bitcoin price prediction. Keras.jl not only lets you analyse the results of a Keras model with a
Flux backend, it is also helpful for debugging, such as inspecting a model's output layer by layer rather than only
the combined output of all the layers. It also provides a pretty simple interface for loading a model. Running a model typically requires two files:
- The weights.h5 model weight file.
- The structure.json model structure file.
Loading a model is as simple as:
```julia
julia> using Keras
julia> model = Keras.load("structure.json", "weights.h5")
```
And model is the corresponding Flux model. It can be called with appropriate inputs to get the output.
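The layer-by-layer debugging mentioned above falls out naturally once the model is in Flux. A minimal sketch, assuming the loaded model is a plain Flux Chain and reusing model from the snippet above (the input size is illustrative):

```julia
# Walk the layers of a Flux Chain one at a time, printing intermediate
# output sizes -- handy when comparing against the original Keras model.
x = rand(Float32, 100)  # illustrative input; the shape depends on the model
for (i, layer) in enumerate(model.layers)
    x = layer(x)
    println("after layer ", i, ": size = ", size(x))
end
```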
Though most of the code was written by me, I'd like to list a few major commits:
- Basic support for the Sequential API. Sequential models are pretty straightforward to deal with, since they contain a straight chain of operators.
- Basic functional API support. Functional models can be trickier, since the computation graph they generate can be complicated. I added basic support for functional models in this PR.
- Docs and miscellaneous changes.
- IMDb sentiment analysis example.
- Text Generation
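To make the layer mapping behind these commits concrete, here's a minimal sketch of converting a single layer. This is not the actual Keras.jl code, and convert_dense and keras_activation are hypothetical helper names. One detail worth noting: Keras stores a Dense kernel as (inputs × units), while Flux's Dense expects (out × in), so the kernel has to be transposed during conversion.

```julia
using Flux

# Hypothetical helper: map a Keras activation name to a Flux function
# (only a small subset of activations shown, for illustration).
keras_activation(name) =
    name == "relu"    ? relu :
    name == "sigmoid" ? Flux.σ :
    name == "tanh"    ? tanh : identity

# Hypothetical converter: build a Flux Dense layer from Keras weights.
function convert_dense(W::AbstractMatrix, b::AbstractVector, activation::String)
    Dense(permutedims(W), b, keras_activation(activation))
end

# Example: a Keras Dense layer with 3 inputs, 2 units and relu activation.
W = rand(Float32, 3, 2)             # Keras kernel: (inputs × units)
b = zeros(Float32, 2)
layer = convert_dense(W, b, "relu")
layer(rand(Float32, 3))             # 2-element output
```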
Since Keras-TensorFlow is a fairly evolved framework, Flux might need to add support for more complicated operators in the future. The remaining work in Keras.jl is to keep the package updated as Flux changes.
Miscellaneous Contributions:
I added computer vision models generated by ONNX.jl to Metalhead.jl. This included four models:
- SqueezeNet
- DenseNet
- ResNet
- GoogLeNet
Since I was already working with pre-trained models, I also used Keras.jl to work on a PR in TextAnalysis.jl. The aim was to implement Sentiment Analysis functionality with a simple interface. The changes can be found here, from where the work was continued in a separate PR. I also have a few unmerged PRs at the moment:
- Support for Local Response Normalization.
- Julia 0.7 support in DataFlow.
- Julia 0.7 support in Lazy.
Apart from pull requests, I also opened a few issues highlighting changes and additions required in Flux. As Flux
evolves into a simple-to-use yet powerful framework, it'll be important to work on these issues.
A few of the issues that still need work are:
- Grouped Convolutions support in Flux.
- Type of data changes as it passes through a specific layer.
- Support for asymmetric padding.
How this project helps the community
Flux is an amazing framework. It combines the speed and elegance of Julia with a simple yet powerful structure,
making training models and implementing layers as simple as writing a few lines of code. As Flux gains popularity
as a major machine learning framework, we need to make sure people realize its power. One way to do this is to import models
made in other diverse frameworks, such as TensorFlow, PyTorch and CoreML, into Flux. This is where
my project comes in.
With ONNX.jl and Keras.jl, you can easily load and run diverse models in Flux. These can then be used
in other packages
providing a suite of computer vision functionalities, such as Metalhead.jl.
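As an illustration, here's roughly what using one of these pre-trained models through Metalhead.jl looks like. This sketch assumes Metalhead's VGG19 constructor and classify helper as they existed at the time of writing, and the image file name is made up:

```julia
using Metalhead
using Images  # provides `load` for reading image files

vgg = Metalhead.VGG19()                # pre-trained VGG19
img = load("elephant.jpg")             # any test image on disk
println(Metalhead.classify(vgg, img))  # predicted ImageNet class label
```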
Conclusion:
These three months made me realize how vast, deep and interesting the field of Machine Learning is. I'm glad
I was able to meet all the objectives I had proposed. Packages such as ONNX.jl will need constant updates until ONNX
reaches a stable release. I'll continue working in this direction, and I hope the experience I gained
writing most of ONNX.jl over the past three months will help me overcome the challenges ahead.
I'd like to thank my mentors for this project,
Phil Tomson and
Mike Innes, for their support and guidance. I'd also like to thank
James Bradbury for helping me whenever I faced any issues. The Julia community was truly a pleasure to work with, and I hope
to continue learning from them in the future. Shout-out to the entire Julia community for organizing an amazing JuliaCon this
year, and for sponsoring my trip!