
.. _getting-started:

===============
Getting started
===============

This getting started guide walks you, the data scientist, through using Fusion
for the first time after it has been installed and activated.

After completing this guide, you will be able to:

* Open a demo data file and run a clustering algorithm in Excel.

* Display an interactive plot of the clustering results in Fusion.

* Create a Jupyter Notebook to export functions from Anaconda to Excel.

Before you start
================

If you have not yet installed and started Fusion, you must do so before using
this guide. For more information, see :doc:`../install/index`.

Opening the Clustering notebook on Windows
------------------------------------------

To open the notebook named ``Clustering``:

#. Click Start and then select the Fusion Example Spreadsheet icon.

#. Open the ``clustering.xlsx`` demo spreadsheet in Excel 2016.

#. In the Excel ribbon, click the **Insert** tab then select My Add-ins.

#. In the Office Add-ins window, click the **Shared Folder** tab and select Anaconda Fusion.

#. Click the OK button.

#. In the Fusion pane, click the **Notebooks** tab.

#. In the example notebooks list, select clustering.ipynb.


Opening the Clustering notebook on macOS
----------------------------------------

To open the notebook named ``Clustering``:

#. Open the terminal window, and run the command ``fusion-examples``.

#. In the Finder window, select the ``clustering.xlsx`` spreadsheet and open it in Excel 2016.

#. In the Excel ribbon, click the **Insert** tab then go to My Add-ins > Anaconda Fusion to activate the Fusion Add-in:

   .. figure:: /img/fusion_menu_activation_mac.png
    :width: 50%

   |

#. In the Fusion pane, click the **Notebooks** tab.

#. In the example notebooks list, select clustering.ipynb.


Executing the clustering algorithm in Excel
===========================================

In the following clustering demonstration, you will learn how to execute code
created by a data scientist directly from Excel. You will run Python code that
applies a machine learning algorithm to your data and visualizes the resulting
output.

Creating a dataset from Excel data
----------------------------------

To select data in the spreadsheet and make it visible to Fusion as a named
dataset:

#. In the ``NOISY_CIRCLES`` table of the ``clustering.xlsx`` spreadsheet, find the columns ``x`` and ``y``.

#. Highlight at least 100 rows of those two columns, without selecting the ``x`` and ``y`` column headers.

#. In the **Fusion** pane, click the **Data** tab, then select Current Selection:

   .. figure:: /img/fusion_add_data.png
    :width: 50%

   |

#. In the Name field, type ``noisy_circles_small`` to name the dataset you selected in step 2.

#. Click the Confirm button to save the dataset.

   .. figure:: /img/fusion_confirm_add_data.png
    :width: 50%

   |

#. Click the **clustering.ipynb** tab at the bottom of the pane.

Running the clustering algorithm from the Fusion pane
-----------------------------------------------------

#. In the list at the top of the **Fusion** pane, select clustering.

#. In the three lists under **Inputs**, select the following parameters for the algorithm:

   * In the Select Data list, select your ``noisy_circles_small`` dataset.
   * In the Select Algorithm list, select ``MiniBatchKMeans``.
   * In the n_clusters list, leave the default selection ``---``.

   .. figure:: /img/fusion_clustering_params.png
        :width: 50%

   |

#. Click the Run button to produce a plot of the clustering results in Fusion:

   .. figure:: /img/fusion_clustering_plot.png
    :width: 50%

   |

The Python code you executed from within Excel runs a machine learning
algorithm on your data and visualizes its output.
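
To give a feel for what a mini-batch k-means clusterer does with ``x``/``y``
rows, here is a minimal pure-Python sketch. It is not the code in
``clustering.ipynb`` and not the scikit-learn implementation. The helper names
(``make_noisy_circles``, ``nearest``, ``mini_batch_kmeans``), the generated
sample data, and all parameter defaults are assumptions made for illustration.

.. code-block:: python

   import math
   import random


   def make_noisy_circles(n_points, noise=0.05, seed=0):
       """Generate (x, y) rows on two concentric circles with Gaussian noise."""
       rng = random.Random(seed)
       rows = []
       for i in range(n_points):
           radius = 1.0 if i % 2 == 0 else 0.5  # alternate outer and inner circle
           angle = rng.uniform(0.0, 2.0 * math.pi)
           rows.append((radius * math.cos(angle) + rng.gauss(0.0, noise),
                        radius * math.sin(angle) + rng.gauss(0.0, noise)))
       return rows


   def nearest(point, centers):
       """Return the index of the center closest to point."""
       return min(range(len(centers)), key=lambda k: math.dist(point, centers[k]))


   def mini_batch_kmeans(rows, n_clusters=2, batch_size=20, n_iter=50, seed=0):
       """Cluster rows with a simplified mini-batch k-means; return one label per row."""
       rng = random.Random(seed)
       # Start from randomly chosen data points.
       centers = [list(p) for p in rng.sample(rows, n_clusters)]
       counts = [0] * n_clusters
       for _ in range(n_iter):
           batch = rng.sample(rows, min(batch_size, len(rows)))
           # Assign the whole batch first, then move the centers.
           assignments = [nearest(point, centers) for point in batch]
           for point, k in zip(batch, assignments):
               counts[k] += 1
               rate = 1.0 / counts[k]  # per-center step size shrinks as it sees more points
               centers[k] = [(1.0 - rate) * c + rate * x
                             for c, x in zip(centers[k], point)]
       return [nearest(point, centers) for point in rows]


   rows = make_noisy_circles(100)
   labels = mini_batch_kmeans(rows, n_clusters=2)
   print(len(labels), sorted(set(labels)))

Each row receives an integer cluster label. A plot like the one in Fusion
would typically color each point by its label.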

Running the clustering algorithm in the Excel formula bar
---------------------------------------------------------

To run the clustering algorithm in the Excel formula bar:

NOTE: Parameters in brackets are optional and their default values will be used
if you do not specify new ones.

#. Select an empty cell and type ``=clustering(data, [algorithm], [n_clusters])``.

   EXAMPLE: ``=clustering(B3:C1502, "MiniBatchKMeans", 5)``.

#. Press the Enter key to execute the algorithm.

The Python code you executed from within Excel runs a machine learning algorithm
on your data and visualizes its output.
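
The bracketed parameters can be thought of as Python keyword arguments with
default values. The sketch below uses a hypothetical stand-in function,
``clustering_call``, to show how omitted arguments fall back to defaults. The
default values shown are placeholders, not Fusion's actual defaults.

.. code-block:: python

   def clustering_call(data, algorithm="MiniBatchKMeans", n_clusters=2):
       """Hypothetical stand-in: report which argument values a call resolves to."""
       return {"rows": len(data), "algorithm": algorithm, "n_clusters": n_clusters}


   data = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

   # Only the required argument is given, so both optional ones use their defaults.
   print(clustering_call(data))

   # All three arguments are given, as in the formula bar EXAMPLE.
   print(clustering_call(data, "MiniBatchKMeans", 5))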

Creating a notebook to export functions to Excel
================================================

If you already use Python for data analysis and want to make your code available
for your coworkers using Excel, this section will teach you how to do so. It
will demonstrate how to create Python code that others can open and execute
using Excel.

NOTE: The following script will calculate and display the sum of the even
numbers in an Excel list.

To run this script:

#. If the Fusion server is not running, click Start and select the Fusion icon. A black window with white text opens, containing the Fusion server log.

#. Open a browser.

#. Type the address of the Jupyter Notebook server that was installed with Fusion. By default, the address is ``https://localhost:9888``:

   TIP: If you did not use the default address, you can find it by opening the Fusion window and locating the ``localhost:...`` string. Type that address in your browser.

   .. figure:: /img/fusion_jupyter_opening.png
    :width: 50%

   |

#. In your Jupyter Notebooks window, click the **Files** tab.

#. On the **Files** tab, click on the Notebooks folder.

#. Click the New button and select Python [default] to display a new Jupyter Notebook with the Python 3 kernel.

#. In the new notebook's single cell, type:

   .. code-block:: python

      from anacondafusion.fusion import fusion

      @fusion.register()
      def add_evens(data):
          # Walk every cell in every row and add up the even values.
          total = 0
          for row in data:
              for item in row:
                  if item % 2 == 0:
                      total = total + item
          return total

   TIP: The ``add_evens`` function is exposed to Excel with the ``@fusion.register()`` decorator.

   .. figure:: /img/fusion_jupyter_function.png
    :width: 50%

   |

#. Click on File and select Save and Checkpoint.

   NOTE: At the top left, next to the Jupyter symbol, is the notebook name followed by "Last Checkpoint:...". By default, the notebook name is ``Untitled``.

The ``add_evens`` function is saved and can now be used.
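
To check the function's logic before calling it from Excel, you can run the
same loop in plain Python. This sketch leaves out the ``@fusion.register()``
decorator and assumes that Fusion passes the selected range as a list of rows,
which is what the nested loops suggest.

.. code-block:: python

   def add_evens(data):
       """Sum the even numbers in a list of rows (same logic as the notebook cell)."""
       total = 0
       for row in data:
           for item in row:
               if item % 2 == 0:
                   total = total + item
       return total


   # One row holding 1, 2, 3, 4: the even values are 2 and 4.
   print(add_evens([[1, 2, 3, 4]]))  # prints 6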

Using the Jupyter Notebooks interface
=====================================

The Jupyter Notebooks interface in your browser is the best way to create, edit,
and delete your Fusion notebooks. To run the ``add_evens`` function from the
notebook you just saved, return to Excel:

#. In the Excel ribbon, click the **Insert** tab then select My Add-ins.

#. In the Office Add-ins window, click the **Shared Folder** tab and select Anaconda Fusion.

#. Click the OK button. 

#. Click the **Notebooks** tab and then select the notebook you saved in the last section. By default, it is named ``Untitled.ipynb``.

#. In the function list, verify that ``add_evens`` is selected.

#. On the blank Excel worksheet, in cells ``B2:E2`` type ``1``, ``2``, ``3``, and ``4``.

#. Select ``B2:E2``.

#. In the **Fusion** pane, click the **Data** tab, then select Current Selection.

#. In the Name field, type ``mydata``.

#. Click the Confirm button to define ``mydata`` as the dataset you selected in Excel. This creates an object accessible to Jupyter that points to your Excel dataset.

#. In Excel, click on cell ``C4``.

   NOTE: This cell will contain the result.

#. Call the ``add_evens`` function on ``mydata`` by clicking Fusion's **Inputs** list and selecting ``mydata``.

#. Click the Run button.

In Excel, cell ``C4`` should now display the sum ``6``, the total of the even values ``2`` and ``4``.


More practice
=============

At this point you may want to go back to your browser and look at the first
example to see how we wrote the ``clustering.ipynb`` notebook.

Other output options
====================

The Output section of Fusion displays the Options link. Clicking this link
displays the Select Default Output list, which sets the default way Fusion
outputs data to Excel: either Selection or Cell/Range.

The Output section also displays the Export link, which exports the most recent
result from Fusion to Excel.

The Select Export Destination list also offers a choice of Selection or
Cell/Range.

Next steps
==========

Now that you have a basic understanding of how to use Fusion, you are ready to
start performing some user-specific tasks.

For more information, see :doc:`bus-analyst/index` and/or :doc:`data-scientist/index`.
