
Prepping for the Oracle AI Cloud: Libraries and Tools

Capgemini
2018-05-29

The Oracle PaaS cloud will be extended with Artificial Intelligence (AI) capabilities. The previous blog, The Business case for the Oracle AI Cloud, describes the capabilities and usage scenarios. The Oracle AI cloud is supported by open source libraries based on Python and by Oracle’s strong cloud integration capabilities, grouped into three areas:

  • Libraries and Tools, containing the Python libraries that are crucial for complex operations on large data sets
  • Deep Learning Frameworks, with TensorFlow / Keras, originating from Google, supporting neural networks for deep analysis of data
  • Elastic AI and Machine Learning Infrastructure, underpinning the platform with a rich set of high-performance components

What do you need to do in order to be prepared to work with the Oracle AI cloud? This blog looks at Python and at the first area, “Libraries and Tools”.

Python

Python is the de facto choice in the scientific and data science communities. These communities have developed many libraries that support complex computational operations on large data sets. The syntax of Python is designed to be clear and readable, and its guiding principles are summarized in the Zen of Python. You can display the Zen by running “import this” in Python.
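
As a small illustration of the “import this” tip: in an interactive session, typing import this is all you need. The sketch below does the same from a script and captures the printed text in a string (the variable names are only illustrative), which relies on the fact that the module prints the Zen the first time it is imported in an interpreter session.

```python
import contextlib
import io

# Capture whatever is printed while the module is imported.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import this  # prints only on the first import in a given interpreter session

zen_text = buffer.getvalue()

# Show the first few lines of the Zen of Python.
for line in zen_text.splitlines()[:5]:
    print(line)
```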

Multiple resources (books, YouTube videos, and online tutorials) help you learn Python. Coming from a PL/SQL and Java background, I found it really easy to learn Python and to progress quickly with it.

Jupyter

Jupyter Notebook (formerly IPython Notebook) is a web-based interactive computational environment for creating, executing, and visualizing notebooks. It supports multiple languages, Python being one of them. As described in What is Jupyter: Jupyter Notebooks are revolutionizing the way engineers and data scientists work together… it’s a tool for collaborating. It’s built for writing and sharing code and text, within the context of a web page. That is really the power of Jupyter: you write and execute your code in a browser, enrich it with Markdown text, collaborate with others, and share the whole page as a notebook.

Anaconda

As with other languages, there are different ways to develop and run applications. With the multitude of libraries for Python, you need an environment that keeps track of changes in libraries and also directly supports the most important ones. Of course, you can manage libraries in Python with pip or conda. A tool that really speeds things up is Anaconda. It ships the crucial libraries out of the box, includes a library manager, and allows you to easily run Jupyter Notebooks.
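
To get a feel for keeping track of libraries, here is a minimal sketch that reports which of the libraries discussed in this blog are installed in the current environment, and in which version. It uses only the standard library module importlib.metadata (Python 3.8 and later); the helper name installed_version and the list of package names are illustrative assumptions, not something Anaconda or Oracle prescribes.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as they are published for pip; adjust to your own stack.
for name in ("numpy", "pandas", "matplotlib", "Pillow", "scikit-learn"):
    version = installed_version(name)
    print(f"{name}: {version if version is not None else 'not installed'}")
```

In an Anaconda environment most of these should report a version straight after installation; in a bare Python installation they will typically show up as not installed until you add them with pip or conda.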

Python Libraries

A short overview of the Python libraries mentioned in the Oracle AI Platform:

  • NumPy (Numerical Python) is the base library for numerical computing in Python (array handling, math functions, etcetera)
  • Pandas (the name comes from Panel Data and Python Data Analysis) is designed to work with tabular and heterogeneous data, and is in a way the equivalent of Excel in Python
  • Matplotlib, not mentioned in the picture but most likely part of the AI stack, delivers rich visualisation capabilities
  • OpenCSV is a CSV (comma-separated values) parser library for Java
  • Pillow is the maintained fork of the Python Imaging Library (PIL), used for opening and manipulating images
  • scikit-learn provides machine learning algorithms in Python
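
To make the first two libraries a bit more concrete, here is a minimal sketch, assuming NumPy and pandas are installed (they are part of a standard Anaconda installation). The temperature readings and city names are made-up sample data, used only to show array arithmetic in NumPy and a grouped, Excel-like calculation in pandas.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic is applied to the whole array at once, no explicit loop.
temperatures_c = np.array([18.5, 21.0, 19.5, 23.0])  # made-up sample values
temperatures_f = temperatures_c * 9 / 5 + 32
mean_c = temperatures_c.mean()

# pandas: labelled, tabular data with columns of different types.
readings = pd.DataFrame(
    {
        "city": ["Utrecht", "Utrecht", "Amsterdam", "Amsterdam"],
        "temp_c": temperatures_c,
    }
)

# Comparable to a pivot table in Excel: average temperature per city.
mean_per_city = readings.groupby("city")["temp_c"].mean()

print(temperatures_f)
print(mean_c)
print(mean_per_city)
```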

Study Resources

  1. Pythonista editor on iPad
    I started with a Dutch Python book (Handboek Python or Python Apprentice) and worked on my iPad with the Pythonista editor.
  2. Python for Data Analysis (O’Reilly)
    Good explanation of NumPy, Pandas, and Matplotlib, and a short intro on Python and Jupyter.
  3. Podcast https://talkpython.fm
    My favorite weekly, hour-long podcast where usage of Python in different industries is discussed.

Oracle selected a set of Libraries and Tools from the Python Data Science and Machine Learning ecosystem for the Oracle AI Platform. This is a smart move, since there is a very large scientific and data science community that has been developing all sorts of libraries to support data crunching on large data sets. In the next blog, we will dive into the machine learning library scikit-learn.

This blog series was co-authored by Léon Smiers and Johan Louwers. Léon Smiers is an Oracle ACE and a thought leader on Oracle cloud within Capgemini. Johan Louwers is an Oracle ACE Director and global chief architect for Oracle technology. Both can be contacted for more information about this and other topics via email: Leon.Smiers@capgemini.com and Johan.Louwers@capgemini.com