Troubleshooting (AEN 4.1.1)
===========================

.. raw:: html

    <div class="section" id="overview">
    <h2>Overview<a class="headerlink" href="#overview" title="Permalink to this headline">¶</a></h2>
    <p>This is a troubleshooting guide for an Anaconda Enterprise Notebooks
    deployment.</p>
    </div>
    <div class="section" id="normal-operation">
    <span id="aen-troubleshooting-normal-operation"></span><h2>Normal Operation<a class="headerlink" href="#normal-operation" title="Permalink to this headline">¶</a></h2>
    <div class="section" id="server">
    <h3>Server<a class="headerlink" href="#server" title="Permalink to this headline">¶</a></h3>
    <p>Anaconda Enterprise Notebooks Server is installed in
    <code class="docutils literal"><span class="pre">/opt/wakari/wakari-server</span></code>.</p>
    <p>You can get the status of the server processes with:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="c1"># service wakari-server status</span>
    <span class="n">wk</span><span class="o">-</span><span class="n">server</span>                        <span class="n">RUNNING</span>    <span class="n">pid</span> <span class="mi">20758</span><span class="p">,</span> <span class="n">uptime</span> <span class="mi">5</span> <span class="n">days</span><span class="p">,</span> <span class="mi">0</span><span class="p">:</span><span class="mi">30</span><span class="p">:</span><span class="mi">23</span>
    <span class="n">worker</span>                           <span class="n">RUNNING</span>    <span class="n">pid</span> <span class="mi">20757</span><span class="p">,</span> <span class="n">uptime</span> <span class="mi">5</span> <span class="n">days</span><span class="p">,</span> <span class="mi">0</span><span class="p">:</span><span class="mi">30</span><span class="p">:</span><span class="mi">23</span>
    </pre></div>
    </div>
    <p>or:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span>root@server # ps -Hu wakari
      PID TTY          TIME CMD
    20756 ?        00:02:26 .supervisord
    20757 ?        00:05:58   mtq-worker
    20758 ?        00:00:08   wk-server
    20765 ?        00:02:00     wk-server
    20766 ?        00:01:55     wk-server
    20767 ?        00:02:20     wk-server
    20770 ?        00:02:02     wk-server
    </pre></div>
    </div>
    <table border="1" class="docutils">
    <colgroup>
    <col width="20%" />
    <col width="80%" />
    </colgroup>
    <thead valign="bottom">
    <tr class="row-odd"><th class="head">supervisord</th>
    <th class="head">details</th>
    </tr>
    </thead>
    <tbody valign="top">
    <tr class="row-even"><td>description</td>
    <td>Manages the <code class="docutils literal"><span class="pre">wakari-worker</span></code> process and multiple <code class="docutils literal"><span class="pre">wk-server</span></code> processes.</td>
    </tr>
    <tr class="row-odd"><td>user</td>
    <td><code class="docutils literal"><span class="pre">wakari</span></code></td>
    </tr>
    <tr class="row-even"><td>configuration</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-server/etc/supervisord.conf</span></code></td>
    </tr>
    <tr class="row-odd"><td>log</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-server/var/log/supervisord.log</span></code></td>
    </tr>
    <tr class="row-even"><td>control</td>
    <td><code class="docutils literal"><span class="pre">service</span> <span class="pre">wakari-server</span></code></td>
    </tr>
    <tr class="row-odd"><td>ports</td>
    <td>none</td>
    </tr>
    </tbody>
    </table>
    <table border="1" class="docutils">
    <colgroup>
    <col width="14%" />
    <col width="86%" />
    </colgroup>
    <thead valign="bottom">
    <tr class="row-odd"><th class="head">wk-server</th>
    <th class="head">details</th>
    </tr>
    </thead>
    <tbody valign="top">
    <tr class="row-even"><td>description</td>
    <td>Handles user interaction and passes jobs on to the wakari gateway. Access to it is managed by nginx.</td>
    </tr>
    <tr class="row-odd"><td>user</td>
    <td><code class="docutils literal"><span class="pre">wakari</span></code></td>
    </tr>
    <tr class="row-even"><td>command</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-server/bin/wk-server</span></code></td>
    </tr>
    <tr class="row-odd"><td>configuration</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-server/etc/wakari/</span></code></td>
    </tr>
    <tr class="row-even"><td>control</td>
    <td><code class="docutils literal"><span class="pre">service</span> <span class="pre">wakari-server</span></code></td>
    </tr>
    <tr class="row-odd"><td>logs</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-server/var/log/wakari/server.log</span></code></td>
    </tr>
    <tr class="row-even"><td>ports</td>
    <td>5000 (only on localhost)</td>
    </tr>
    </tbody>
    </table>
    <table border="1" class="docutils">
    <colgroup>
    <col width="23%" />
    <col width="77%" />
    </colgroup>
    <thead valign="bottom">
    <tr class="row-odd"><th class="head">wakari-worker</th>
    <th class="head">details</th>
    </tr>
    </thead>
    <tbody valign="top">
    <tr class="row-even"><td>description</td>
    <td>Asynchronously executes tasks from <code class="docutils literal"><span class="pre">wk-server</span></code></td>
    </tr>
    <tr class="row-odd"><td>user</td>
    <td><code class="docutils literal"><span class="pre">wakari</span></code></td>
    </tr>
    <tr class="row-even"><td>logs</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-server/var/log/wakari/worker.log</span></code></td>
    </tr>
    <tr class="row-odd"><td>control</td>
    <td><code class="docutils literal"><span class="pre">service</span> <span class="pre">wakari-server</span></code></td>
    </tr>
    </tbody>
    </table>
    <table border="1" class="docutils">
    <colgroup>
    <col width="12%" />
    <col width="88%" />
    </colgroup>
    <thead valign="bottom">
    <tr class="row-odd"><th class="head">nginx</th>
    <th class="head">details</th>
    </tr>
    </thead>
    <tbody valign="top">
    <tr class="row-even"><td>description</td>
    <td>Serves static files and acts as a proxy for all other requests, which are passed to the wk-server process running on port 5000.</td>
    </tr>
    <tr class="row-odd"><td>user</td>
    <td>nginx</td>
    </tr>
    <tr class="row-even"><td>configuration</td>
    <td><code class="docutils literal"><span class="pre">/etc/nginx/nginx.conf</span></code>
    <code class="docutils literal"><span class="pre">/opt/wakari/wakari-server/etc/conf.d/www.enterprise.conf</span></code></td>
    </tr>
    <tr class="row-odd"><td>logs</td>
    <td><code class="docutils literal"><span class="pre">/var/log/nginx/woc.log</span></code> <code class="docutils literal"><span class="pre">/var/log/nginx/woc-error.log</span></code></td>
    </tr>
    <tr class="row-even"><td>control</td>
    <td><code class="docutils literal"><span class="pre">service</span> <span class="pre">nginx</span> <span class="pre">status</span></code></td>
    </tr>
    <tr class="row-odd"><td>port</td>
    <td>80</td>
    </tr>
    </tbody>
    </table>
    <p>Nginx runs at least two processes:</p>
    <ul class="simple">
    <li>a master process running as the root user</li>
    <li>worker processes running as the nginx user</li>
    </ul>
    </div>
    </div>
    <div class="section" id="gateway">
    <h2>Gateway<a class="headerlink" href="#gateway" title="Permalink to this headline">¶</a></h2>
    <p>Anaconda Enterprise Notebooks Gateway is installed in
    <code class="docutils literal"><span class="pre">/opt/wakari/wakari-gateway</span></code>.</p>
    <p>You can get the status of the gateway processes with:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="c1"># service wakari-gateway status</span>
    <span class="n">wk</span><span class="o">-</span><span class="n">gateway</span>                       <span class="n">RUNNING</span>    <span class="n">pid</span> <span class="mi">1137</span><span class="p">,</span> <span class="n">uptime</span> <span class="mi">5</span> <span class="n">days</span><span class="p">,</span> <span class="mi">1</span><span class="p">:</span><span class="mi">59</span><span class="p">:</span><span class="mi">28</span>
    </pre></div>
    </div>
    <p>or:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span>root@gateway # ps -Hu wakari
      PID TTY          TIME CMD
     1136 ?        00:01:59 .supervisord
     1137 ?        00:00:02   wk-gateway
    </pre></div>
    </div>
    <table border="1" class="docutils">
    <colgroup>
    <col width="23%" />
    <col width="77%" />
    </colgroup>
    <thead valign="bottom">
    <tr class="row-odd"><th class="head">supervisord</th>
    <th class="head">details</th>
    </tr>
    </thead>
    <tbody valign="top">
    <tr class="row-even"><td>description</td>
    <td>Manages the <code class="docutils literal"><span class="pre">wk-gateway</span></code> process.</td>
    </tr>
    <tr class="row-odd"><td>user</td>
    <td><code class="docutils literal"><span class="pre">wakari</span></code></td>
    </tr>
    <tr class="row-even"><td>configuration</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-gateway/etc/supervisord.conf</span></code></td>
    </tr>
    <tr class="row-odd"><td>log</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-gateway/var/log/supervisord.log</span></code></td>
    </tr>
    <tr class="row-even"><td>control</td>
    <td><code class="docutils literal"><span class="pre">service</span> <span class="pre">wakari-gateway</span></code></td>
    </tr>
    <tr class="row-odd"><td>ports</td>
    <td>none</td>
    </tr>
    </tbody>
    </table>
    <table border="1" class="docutils">
    <colgroup>
    <col width="18%" />
    <col width="82%" />
    </colgroup>
    <thead valign="bottom">
    <tr class="row-odd"><th class="head">wakari-gateway</th>
    <th class="head">details</th>
    </tr>
    </thead>
    <tbody valign="top">
    <tr class="row-even"><td>description</td>
    <td>Passes requests from Anaconda Enterprise Notebooks Server to the Compute Nodes.</td>
    </tr>
    <tr class="row-odd"><td>user</td>
    <td><code class="docutils literal"><span class="pre">wakari</span></code></td>
    </tr>
    <tr class="row-even"><td>configuration</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-gateway/etc/wakari/wk-gateway-config.json</span></code></td>
    </tr>
    <tr class="row-odd"><td>logs</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-gateway/var/log/wakari/gateway.application.log</span></code>
    <code class="docutils literal"><span class="pre">/opt/wakari/wakari-gateway/var/log/wakari/gateway.log</span></code></td>
    </tr>
    <tr class="row-even"><td>working dir</td>
    <td><code class="docutils literal"><span class="pre">/</span></code> (root)</td>
    </tr>
    <tr class="row-odd"><td>port</td>
    <td>8089 (webcache)</td>
    </tr>
    </tbody>
    </table>
    </div>
    <div class="section" id="compute-node">
    <h2>Compute Node<a class="headerlink" href="#compute-node" title="Permalink to this headline">¶</a></h2>
    <p>Anaconda Enterprise Notebooks Compute is installed in
    <code class="docutils literal"><span class="pre">/opt/wakari/wakari-compute</span></code>.</p>
    <p>You can get the status of the compute node processes with:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="c1"># service wakari-compute status</span>
    <span class="n">wk</span><span class="o">-</span><span class="n">compute</span>                       <span class="n">RUNNING</span>    <span class="n">pid</span> <span class="mi">22050</span><span class="p">,</span> <span class="n">uptime</span> <span class="mi">3</span> <span class="n">days</span><span class="p">,</span> <span class="mi">1</span><span class="p">:</span><span class="mi">03</span><span class="p">:</span><span class="mi">19</span>
    </pre></div>
    </div>
    <p>or:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span>root@compute # ps -Hu wakari
      PID TTY          TIME CMD
     1150 ?        00:02:01 .supervisord
     1152 ?        00:00:01   wk-compute
    </pre></div>
    </div>
    <p>wk-compute will load each of these configuration files, in order:</p>
    <ul class="simple">
    <li><code class="docutils literal"><span class="pre">/etc/wakari/config.json</span></code></li>
    <li><code class="docutils literal"><span class="pre">/etc/wakari/compute-launcher-config.json</span></code></li>
    <li><code class="docutils literal"><span class="pre">./compute-launcher-config.json</span></code></li>
    <li>Config file specified by <code class="docutils literal"><span class="pre">-c</span></code> option</li>
    </ul>
    <p>If an option is specified in multiple files, the last one encountered
    takes precedence.</p>
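    <p>The precedence rule can be illustrated with a minimal sketch. It assumes a
    simplified model in which each file holds flat <code class="docutils literal"><span class="pre">key=value</span></code> lines rather
    than JSON, and the helper name <code class="docutils literal"><span class="pre">merge_last_wins</span></code> is hypothetical. It is
    not the loader that wk-compute uses.</p>
    <div class="highlight-default"><div class="highlight"><pre>
    ```shell
    # Hypothetical helper: print the effective value of each key when the
    # given files are read in order and later files override earlier ones.
    # Assumes flat key=value lines, not the real JSON configuration format.
    merge_last_wins() {
        for f in "$@"; do
            # Skipping absent files is an assumption; the source does not
            # say how missing files are handled.
            if [ -f "$f" ]; then cat "$f"; fi
        done | awk '{
            i = index($0, "=")
            if (i == 0) next
            value[substr($0, 1, i - 1)] = substr($0, i + 1)
        } END {
            for (k in value) print k "=" value[k]
        }' | sort
    }
    ```
    </pre></div>
    </div>
    <p>Called with two files in load order, a key that appears in both takes its
    value from the second file.</p>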
    <table border="1" class="docutils">
    <colgroup>
    <col width="23%" />
    <col width="77%" />
    </colgroup>
    <thead valign="bottom">
    <tr class="row-odd"><th class="head">supervisord</th>
    <th class="head">details</th>
    </tr>
    </thead>
    <tbody valign="top">
    <tr class="row-even"><td>description</td>
    <td>Manages the <code class="docutils literal"><span class="pre">wk-compute</span></code> process.</td>
    </tr>
    <tr class="row-odd"><td>user</td>
    <td><code class="docutils literal"><span class="pre">wakari</span></code></td>
    </tr>
    <tr class="row-even"><td>configuration</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-compute/etc/supervisord.conf</span></code></td>
    </tr>
    <tr class="row-odd"><td>log</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-compute/var/log/supervisord.log</span></code></td>
    </tr>
    <tr class="row-even"><td>control</td>
    <td><code class="docutils literal"><span class="pre">service</span> <span class="pre">wakari-compute</span></code></td>
    </tr>
    <tr class="row-odd"><td>working dir</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-compute/etc</span></code></td>
    </tr>
    <tr class="row-even"><td>ports</td>
    <td>none</td>
    </tr>
    </tbody>
    </table>
    <table border="1" class="docutils">
    <colgroup>
    <col width="18%" />
    <col width="82%" />
    </colgroup>
    <thead valign="bottom">
    <tr class="row-odd"><th class="head">wk-compute</th>
    <th class="head">details</th>
    </tr>
    </thead>
    <tbody valign="top">
    <tr class="row-even"><td>description</td>
    <td>Launches compute processes</td>
    </tr>
    <tr class="row-odd"><td>user</td>
    <td><code class="docutils literal"><span class="pre">wakari</span></code></td>
    </tr>
    <tr class="row-even"><td>configuration</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-compute/etc/wakari/wk-compute-launcher-config.json</span></code>
    <code class="docutils literal"><span class="pre">/opt/wakari/wakari-compute/etc/wakari/scripts/config.json</span></code></td>
    </tr>
    <tr class="row-odd"><td>logs</td>
    <td><code class="docutils literal"><span class="pre">/opt/wakari/wakari-compute/var/log/wakari/compute-launcher.application.log</span></code>
    <code class="docutils literal"><span class="pre">/opt/wakari/wakari-compute/var/log/wakari/compute-launcher.log</span></code></td>
    </tr>
    <tr class="row-even"><td>working dir</td>
    <td><code class="docutils literal"><span class="pre">/</span></code> (root)</td>
    </tr>
    <tr class="row-odd"><td>control</td>
    <td><code class="docutils literal"><span class="pre">service</span> <span class="pre">wakari-compute</span></code></td>
    </tr>
    <tr class="row-even"><td>port</td>
    <td>5002 (rfe)</td>
    </tr>
    </tbody>
    </table>
    <div class="section" id="projects-and-permissions">
    <h3>Projects and Permissions<a class="headerlink" href="#projects-and-permissions" title="Permalink to this headline">¶</a></h3>
    <p>Projects live in the projectRoot folder on the compute node (by default,
    /projects). The project directory is created the first time the project
    is started; the start-project script clones it from
    <code class="docutils literal"><span class="pre">/opt/wakari/wakari-compute/lib/node_modules/wakari-compute-launcher/skeleton</span></code>.</p>
    <p>Project directory permissions are as follows:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">owner</span><span class="p">:</span> <span class="n">rwx</span><span class="p">,</span> <span class="n">user</span> <span class="n">who</span> <span class="n">created</span> <span class="n">the</span> <span class="n">project</span>
    <span class="n">group</span><span class="p">:</span> <span class="n">rwx</span><span class="p">,</span> <span class="n">owner</span><span class="s1">&#39;s group</span>
    <span class="n">other</span><span class="p">:</span> <span class="o">--</span><span class="n">x</span><span class="p">,</span> <span class="n">to</span> <span class="n">allow</span> <span class="n">access</span> <span class="n">to</span> <span class="n">the</span> <span class="n">Public</span> <span class="n">folder</span>
    <span class="n">ACL</span><span class="p">:</span>   <span class="n">rwx</span> <span class="k">for</span> <span class="nb">any</span> <span class="n">other</span> <span class="n">team</span> <span class="n">members</span>
    </pre></div>
    </div>
    <p>Files and subdirectories within the project directory have the same
    permissions as the project directory, except:</p>
    <ol class="arabic simple">
    <li>The public folder and everything in it are world readable.</li>
    <li>Any files hardlinked into the root anaconda environment
    (<code class="docutils literal"><span class="pre">/opt/wakari/anaconda</span></code>) remain owned by the <code class="docutils literal"><span class="pre">root</span></code> or <code class="docutils literal"><span class="pre">wakari</span></code>
    users.</li>
    </ol>
    <p>Project file and directory permissions are maintained by the
    start-project script. All files and directories in the project will have
    their permissions set when the project is started, except for files
    owned by <code class="docutils literal"><span class="pre">root</span></code> or <code class="docutils literal"><span class="pre">wakari</span></code> (in order to avoid changing permissions
    of the linked files in <code class="docutils literal"><span class="pre">/opt/wakari/anaconda</span></code>). As a result, the
    permissions system does not correctly manage project files owned by the
    <code class="docutils literal"><span class="pre">wakari</span></code> user, so that user should not create projects that are
    intended to be shared with others.</p>
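    <p>The project directory mode listed above (owner rwx, group rwx, other --x)
    corresponds to octal 771. As a rough spot check, assuming GNU
    <code class="docutils literal"><span class="pre">stat</span></code> is available, a hypothetical helper such as
    <code class="docutils literal"><span class="pre">check_project_mode</span></code> could compare a directory against that mode. It
    does not inspect ACL entries; <code class="docutils literal"><span class="pre">getfacl</span></code> can be used to view those.</p>
    <div class="highlight-default"><div class="highlight"><pre>
    ```shell
    # Hypothetical helper: report whether a directory has the
    # owner rwx / group rwx / other --x permission bits (octal 771).
    # Any leading setuid/setgid/sticky digit is ignored. Assumes GNU stat.
    check_project_mode() {
        mode=$(stat -c '%a' "$1") || return 2
        case "$mode" in
            *771)
                echo "ok: $1 has mode $mode"
                ;;
            *)
                echo "unexpected: $1 has mode $mode, expected 771"
                return 1
                ;;
        esac
    }
    ```
    </pre></div>
    </div>
    <p>The function takes the path of one project directory under the
    projectRoot folder and returns a non-zero status when the mode differs.</p>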
    </div>
    </div>
    <div class="section" id="general-troubleshooting-steps">
    <h2>General Troubleshooting Steps<a class="headerlink" href="#general-troubleshooting-steps" title="Permalink to this headline">¶</a></h2>
    <div class="section" id="ensure-that-the-anaconda-enterprise-notebooks-services-are-set-to-start-at-boot">
    <h3>Ensure that the Anaconda Enterprise Notebooks services are set to start at boot<a class="headerlink" href="#ensure-that-the-anaconda-enterprise-notebooks-services-are-set-to-start-at-boot" title="Permalink to this headline">¶</a></h3>
    <p>Check this on all three components: the Server, Gateway, and Compute nodes.</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">chkconfig</span> <span class="o">--</span><span class="nb">list</span> <span class="o">|</span> <span class="n">grep</span> <span class="n">wakari</span>
    </pre></div>
    </div>
    <p>If they are missing, you can try adding them with:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">chkconfig</span> <span class="o">--</span><span class="n">add</span> <span class="p">[</span><span class="n">wakari</span><span class="o">-</span><span class="n">server</span><span class="o">|</span><span class="n">wakari</span><span class="o">-</span><span class="n">gateway</span><span class="o">|</span><span class="n">wakari</span><span class="o">-</span><span class="n">compute</span><span class="p">]</span>
    </pre></div>
    </div>
    <p>Then services can be started safely with the <code class="docutils literal"><span class="pre">restart</span></code> command as
    follows:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">service</span> <span class="n">wakari</span><span class="o">-</span><span class="n">server</span> <span class="n">restart</span>
    <span class="n">service</span> <span class="n">wakari</span><span class="o">-</span><span class="n">gateway</span> <span class="n">restart</span>
    <span class="n">service</span> <span class="n">wakari</span><span class="o">-</span><span class="n">compute</span> <span class="n">restart</span>
    </pre></div>
    </div>
    <p>These commands need to be executed on the appropriate nodes.</p>
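    <p>The boot-time check can be scripted. The sketch below assumes the usual
    <code class="docutils literal"><span class="pre">chkconfig</span> <span class="pre">--list</span></code> layout, in which each line starts with the service
    name followed by whitespace; the helper name
    <code class="docutils literal"><span class="pre">missing_from_chkconfig</span></code> is hypothetical.</p>
    <div class="highlight-default"><div class="highlight"><pre>
    ```shell
    # Hypothetical helper: read `chkconfig --list` output on stdin and print
    # each service named in the arguments that does not appear in it.
    missing_from_chkconfig() {
        listing=$(cat)
        for svc in "$@"; do
            if ! printf '%s\n' "$listing" | grep -q "^$svc[[:space:]]"; then
                echo "$svc"
            fi
        done
    }
    ```
    </pre></div>
    </div>
    <p>For example, on the Server node:
    <code class="docutils literal"><span class="pre">chkconfig</span> <span class="pre">--list</span> <span class="pre">|</span> <span class="pre">missing_from_chkconfig</span> <span class="pre">wakari-server</span></code>.
    Pass only the services that are installed on the node being checked.</p>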
    </div>
    <div class="section" id="ensure-that-all-services-are-running">
    <h3>Ensure that all services are running<a class="headerlink" href="#ensure-that-all-services-are-running" title="Permalink to this headline">¶</a></h3>
    <p>Check the status of each service and compare it with the expected output (see <a class="reference internal" href="#aen-troubleshooting-normal-operation"><span class="std std-ref">Normal Operation</span></a>, above):</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="c1"># service wakari-server status</span>
    <span class="n">wk</span><span class="o">-</span><span class="n">server</span>                        <span class="n">RUNNING</span>    <span class="n">pid</span> <span class="mi">20758</span><span class="p">,</span> <span class="n">uptime</span> <span class="mi">5</span> <span class="n">days</span><span class="p">,</span> <span class="mi">0</span><span class="p">:</span><span class="mi">30</span><span class="p">:</span><span class="mi">23</span>
    <span class="n">worker</span>                           <span class="n">RUNNING</span>    <span class="n">pid</span> <span class="mi">20757</span><span class="p">,</span> <span class="n">uptime</span> <span class="mi">5</span> <span class="n">days</span><span class="p">,</span> <span class="mi">0</span><span class="p">:</span><span class="mi">30</span><span class="p">:</span><span class="mi">23</span>

    <span class="n">root</span><span class="nd">@server</span> <span class="c1"># service nginx status</span>
    <span class="n">nginx</span> <span class="p">(</span><span class="n">pid</span>  <span class="mi">26303</span><span class="p">)</span> <span class="ow">is</span> <span class="n">running</span><span class="o">...</span>

    <span class="c1"># service wakari-gateway status</span>
    <span class="n">wk</span><span class="o">-</span><span class="n">gateway</span>                       <span class="n">RUNNING</span>    <span class="n">pid</span> <span class="mi">1137</span><span class="p">,</span> <span class="n">uptime</span> <span class="mi">5</span> <span class="n">days</span><span class="p">,</span> <span class="mi">1</span><span class="p">:</span><span class="mi">59</span><span class="p">:</span><span class="mi">28</span>

    <span class="c1"># service wakari-compute status</span>
    <span class="n">wk</span><span class="o">-</span><span class="n">compute</span>                       <span class="n">RUNNING</span>    <span class="n">pid</span> <span class="mi">22050</span><span class="p">,</span> <span class="n">uptime</span> <span class="mi">3</span> <span class="n">days</span><span class="p">,</span> <span class="mi">1</span><span class="p">:</span><span class="mi">03</span><span class="p">:</span><span class="mi">19</span>
    </pre></div>
    </div>
    <p>If any of the processes are missing, restart them using the commands
    above.</p>
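    <p>To flag problem processes automatically, the status output can be filtered.
    This is a minimal sketch that assumes the output keeps the layout shown
    above, with the process name in the first column and its state in the
    second; the helper name <code class="docutils literal"><span class="pre">not_running</span></code> is hypothetical.</p>
    <div class="highlight-default"><div class="highlight"><pre>
    ```shell
    # Hypothetical helper: read status lines on stdin (name, state, details)
    # and print the name of every process whose state is not RUNNING.
    # Blank lines are skipped.
    not_running() {
        awk 'NF { if ($2 != "RUNNING") print $1 }'
    }
    ```
    </pre></div>
    </div>
    <p>For example: <code class="docutils literal"><span class="pre">service</span> <span class="pre">wakari-server</span> <span class="pre">status</span> <span class="pre">|</span> <span class="pre">not_running</span></code>. Empty
    output means every listed process reported RUNNING. The nginx status line
    uses a different format and is not handled by this sketch.</p>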
    </div>
    <div class="section" id="check-for-extraneous-processes">
    <h3>Check for Extraneous Processes<a class="headerlink" href="#check-for-extraneous-processes" title="Permalink to this headline">¶</a></h3>
    <p>Use <code class="docutils literal"><span class="pre">ps</span> <span class="pre">-Hu</span> <span class="pre">wakari</span></code> to get a complete list of the processes running
    under the <code class="docutils literal"><span class="pre">wakari</span></code> user account.</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span>root@server # ps -Hu wakari
      PID TTY          TIME CMD
    20756 ?        00:02:26 .supervisord
    20757 ?        00:05:58   mtq-worker
    20758 ?        00:00:08   wk-server
    20765 ?        00:02:00     wk-server
    20766 ?        00:01:55     wk-server
    20767 ?        00:02:20     wk-server
    20770 ?        00:02:02     wk-server

    root@server # ps -f -C nginx
    UID        PID  PPID  C STIME TTY          TIME CMD
    root     26303     1  0 12:18 ?        00:00:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
    nginx    26305 26303  0 12:18 ?        00:00:00 nginx: worker process

    root@gateway # ps -Hu wakari
      PID TTY          TIME CMD
     1136 ?        00:01:59 .supervisord
     1137 ?        00:00:02   wk-gateway

    root@compute # ps -Hu wakari
      PID TTY          TIME CMD
     1150 ?        00:02:01 .supervisord
     1152 ?        00:00:01   wk-compute
    </pre></div>
    </div>
    <p>What&#8217;s normal:</p>
    <ul class="simple">
    <li>The wk-server, wk-gateway, and wk-compute processes should have the
    PIDs reported by <code class="docutils literal"><span class="pre">supervisorctl</span></code>.</li>
    <li>The nginx master process should have the PID reported by
    <code class="docutils literal"><span class="pre">service</span> <span class="pre">nginx</span> <span class="pre">status</span></code>.</li>
    <li>If you have installed more than one Anaconda Enterprise Notebooks component on a single
    machine, the processes from all of the installed components will show
    up on that machine.</li>
    <li>On the Compute node, any Anaconda Enterprise Notebooks applications currently being run by
    users will be present. For example:</li>
    </ul>
    <div class="highlight-default"><div class="highlight"><pre><span></span>root@compute # ps -Hu wakari
      PID TTY          TIME CMD
     1150 ?        00:00:00 .supervisord
     1152 ?        00:00:00   wk-compute
     1340 ?        00:00:00 bash
     1341 ?        00:00:00   notebookwrapper
    </pre></div>
    </div>
    <p>If extra wk-server, wk-gateway, wk-compute, or supervisord processes are
    present, use the <code class="docutils literal"><span class="pre">kill</span></code> command to remove them. Then restart the
    services using <code class="docutils literal"><span class="pre">service</span> <span class="pre">SERVICE_NAME</span> <span class="pre">restart</span></code> as described above.</p>
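    <p>Counting processes by command name can help spot extras. The sketch below
    assumes the <code class="docutils literal"><span class="pre">ps</span> <span class="pre">-Hu</span> <span class="pre">wakari</span></code> layout shown above, with the command name in
    the last column; the helper name <code class="docutils literal"><span class="pre">count_procs</span></code> is hypothetical.</p>
    <div class="highlight-default"><div class="highlight"><pre>
    ```shell
    # Hypothetical helper: read `ps -Hu wakari` output on stdin and count
    # the processes whose command name equals the first argument.
    count_procs() {
        awk -v name="$1" '{ if ($NF == name) n++ } END { print n + 0 }'
    }
    ```
    </pre></div>
    </div>
    <p>For example: <code class="docutils literal"><span class="pre">ps</span> <span class="pre">-Hu</span> <span class="pre">wakari</span> <span class="pre">|</span> <span class="pre">count_procs</span> <span class="pre">.supervisord</span></code>. Interpret the
    count against the listings above: several wk-server processes are normal,
    and a machine hosting more than one component runs one supervisord per
    component.</p>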
    </div>
    <div class="section" id="check-connectivity-between-the-servers">
    <h3>Check connectivity between the servers<a class="headerlink" href="#check-connectivity-between-the-servers" title="Permalink to this headline">¶</a></h3>
    <div class="section" id="server-to-gateways">
    <h4>Server to Gateways<a class="headerlink" href="#server-to-gateways" title="Permalink to this headline">¶</a></h4>
    <p>On the Server, navigate to Admin/Data Centers. For each data center in
    the list, check connectivity from the server to that gateway (in this
    example, the gateway is <code class="docutils literal"><span class="pre">http://gateway.example.com:8089</span></code>):</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">root</span><span class="nd">@server</span> <span class="c1"># curl --connect-timeout 5 http://gateway.example.com:8089 &gt; /dev/null</span>
    </pre></div>
    </div>
    </div>
    <div class="section" id="gateways-to-compute-nodes">
    <h4>Gateways to Compute Nodes<a class="headerlink" href="#gateways-to-compute-nodes" title="Permalink to this headline">¶</a></h4>
    <p>On the Server, navigate to Admin/Enterprise Resources. For each compute
    resource in the list, open it and check the contents of the URL field to
    ensure that it begins with either &#8220;http&#8221; or &#8220;https&#8221;. Check connectivity
    to that URL from the corresponding Gateway. For example, if the URL is
    <code class="docutils literal"><span class="pre">http://compute.example.com:5002</span></code>:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">root</span><span class="nd">@gateway</span> <span class="c1"># curl --connect-timeout 5 http://compute.example.com:5002 &gt; /dev/null</span>
    </pre></div>
    </div>
    </div>
    <div class="section" id="gateways-to-server">
    <h4>Gateways to Server<a class="headerlink" href="#gateways-to-server" title="Permalink to this headline">¶</a></h4>
    <p>This path is used by the gateway configuration command
    <code class="docutils literal"><span class="pre">wk-gateway-configure</span></code>. First, ensure that the gateway is linked to
    the correct server in the configuration file and that the full server
    URL is specified. Then check connectivity to the server.</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">root</span><span class="nd">@gateway</span> <span class="c1"># grep WAKARI_SERVER /opt/wakari/wakari-gateway/etc/wakari/wk-gateway-config.json</span>
      <span class="s2">&quot;WAKARI_SERVER&quot;</span><span class="p">:</span> <span class="s2">&quot;http://wakari.example.com&quot;</span><span class="p">,</span>

    <span class="n">root</span><span class="nd">@gateway</span> <span class="c1"># curl --connect-timeout 5 http://wakari.example.com &gt; /dev/null</span>
    <span class="n">root</span><span class="nd">@gateway</span> <span class="c1"># curl --connect-timeout 5 http://error.example.com &gt; /dev/null</span>
    <span class="n">curl</span><span class="p">:</span> <span class="p">(</span><span class="mi">7</span><span class="p">)</span> <span class="n">Failed</span> <span class="n">to</span> <span class="n">connect</span> <span class="n">to</span> <span class="n">error</span><span class="o">.</span><span class="n">example</span><span class="o">.</span><span class="n">com</span> <span class="n">port</span> <span class="mi">80</span><span class="p">:</span> <span class="n">Connection</span> <span class="n">refused</span>
    </pre></div>
    </div>
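    <p>The connectivity checks above can be run over several URLs at once. This is
    a minimal sketch; the helper name <code class="docutils literal"><span class="pre">check_urls</span></code> and the
    <code class="docutils literal"><span class="pre">CURL</span></code> override variable are hypothetical, and the URLs passed to it
    are whatever the Data Centers and Enterprise Resources pages list.</p>
    <div class="highlight-default"><div class="highlight"><pre>
    ```shell
    # Hypothetical helper: try each URL with a 5-second connect timeout and
    # print one OK or FAIL line per URL. Returns non-zero if any URL failed.
    # The CURL variable can name a substitute command, for example in tests.
    check_urls() {
        status=0
        for url in "$@"; do
            if ${CURL:-curl} --silent --connect-timeout 5 --output /dev/null "$url"; then
                echo "OK   $url"
            else
                echo "FAIL $url"
                status=1
            fi
        done
        return $status
    }
    ```
    </pre></div>
    </div>
    <p>For example, from a Gateway:
    <code class="docutils literal"><span class="pre">check_urls</span> <span class="pre">http://wakari.example.com</span> <span class="pre">http://compute.example.com:5002</span></code>.
    Because curl exits with status 0 for any completed HTTP exchange, this tests
    reachability only, not whether the service returned a successful response.</p>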
    <p>If a connection fails, check the following items:</p>
    <ul class="simple">
    <li>Ensure that Gateways (Data Centers) and Compute nodes (Enterprise
    Resources) are correctly configured on the server.</li>
    <li>Verify that processes are listening on the configured ports:</li>
    </ul>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">root</span><span class="nd">@server</span> <span class="c1"># netstat -plt</span>
    <span class="n">Active</span> <span class="n">Internet</span> <span class="n">connections</span> <span class="p">(</span><span class="n">only</span> <span class="n">servers</span><span class="p">)</span>
    <span class="n">Proto</span> <span class="n">Recv</span><span class="o">-</span><span class="n">Q</span> <span class="n">Send</span><span class="o">-</span><span class="n">Q</span> <span class="n">Local</span> <span class="n">Address</span>               <span class="n">Foreign</span> <span class="n">Address</span>             <span class="n">State</span>       <span class="n">PID</span><span class="o">/</span><span class="n">Program</span> <span class="n">name</span>
    <span class="n">tcp</span>        <span class="mi">0</span>      <span class="mi">0</span> <span class="o">*</span><span class="p">:</span><span class="n">http</span>                      <span class="o">*</span><span class="p">:</span><span class="o">*</span>                         <span class="n">LISTEN</span>      <span class="mi">26409</span><span class="o">/</span><span class="n">nginx</span>
    <span class="n">tcp</span>        <span class="mi">0</span>      <span class="mi">0</span> <span class="o">*</span><span class="p">:</span><span class="n">ssh</span>                       <span class="o">*</span><span class="p">:</span><span class="o">*</span>                         <span class="n">LISTEN</span>      <span class="mi">986</span><span class="o">/</span><span class="n">sshd</span>
    <span class="n">tcp</span>        <span class="mi">0</span>      <span class="mi">0</span> <span class="n">localhost</span><span class="p">:</span><span class="n">smtp</span>              <span class="o">*</span><span class="p">:</span><span class="o">*</span>                         <span class="n">LISTEN</span>      <span class="mi">1063</span><span class="o">/</span><span class="n">master</span>
    <span class="n">tcp</span>        <span class="mi">0</span>      <span class="mi">0</span> <span class="o">*</span><span class="p">:</span><span class="n">commplex</span><span class="o">-</span><span class="n">main</span>             <span class="o">*</span><span class="p">:</span><span class="o">*</span>                         <span class="n">LISTEN</span>      <span class="mi">26192</span><span class="o">/</span><span class="n">python</span>
    <span class="n">tcp</span>        <span class="mi">0</span>      <span class="mi">0</span> <span class="n">localhost</span><span class="p">:</span><span class="mi">27017</span>             <span class="o">*</span><span class="p">:</span><span class="o">*</span>                         <span class="n">LISTEN</span>      <span class="mi">29261</span><span class="o">/</span><span class="n">mongod</span>
    <span class="n">tcp</span>        <span class="mi">0</span>      <span class="mi">0</span> <span class="o">*</span><span class="p">:</span><span class="n">ssh</span>                       <span class="o">*</span><span class="p">:</span><span class="o">*</span>                         <span class="n">LISTEN</span>      <span class="mi">986</span><span class="o">/</span><span class="n">sshd</span>
    <span class="n">tcp</span>        <span class="mi">0</span>      <span class="mi">0</span> <span class="n">localhost</span><span class="p">:</span><span class="n">smtp</span>              <span class="o">*</span><span class="p">:</span><span class="o">*</span>                         <span class="n">LISTEN</span>      <span class="mi">1063</span><span class="o">/</span><span class="n">master</span>
    </pre></div>
    </div>
    <ul class="simple">
    <li>Check firewall settings/logs on both hosts to ensure that packets are
    not being blocked or discarded.</li>
    </ul>
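    <p>As a starting point for the firewall check, the active rules and their packet counters can be listed on each host. This is only a sketch and assumes the hosts use <code class="docutils literal"><span class="pre">iptables</span></code>; other firewall tools need their own commands:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span># Sketch: list active rules with per-rule packet and byte counters
    # (assumes iptables is in use; run as root on both hosts)
    iptables -L -n -v
    </pre></div>
    </div>
    <p>Counters that increase on a DROP or REJECT rule while a connection attempt is repeated suggest that the rule is blocking the traffic.</p>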
    </div>
    </div>
    <div class="section" id="check-configuration-file-syntax">
    <h3>Check Configuration File Syntax<a class="headerlink" href="#check-configuration-file-syntax" title="Permalink to this headline">¶</a></h3>
    <p>Use these commands to verify that each configuration file contains valid
    JSON:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">root</span><span class="nd">@server</span>  <span class="c1"># python -m json.tool /opt/wakari/wakari-server/etc/wakari/*.json</span>
    <span class="n">root</span><span class="nd">@gateway</span> <span class="c1"># python -m json.tool /opt/wakari/wakari-gateway/etc/wakari/*.json</span>
    <span class="n">root</span><span class="nd">@compute</span> <span class="c1"># python -m json.tool /opt/wakari/wakari-compute/etc/wakari/*.json</span>
    </pre></div>
    </div>
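    <p>If a file is valid, its contents are displayed. If it contains a
    syntax error, a message such as
    <code class="docutils literal"><span class="pre">No</span> <span class="pre">JSON</span> <span class="pre">object</span> <span class="pre">could</span> <span class="pre">be</span> <span class="pre">decoded</span></code> is displayed instead. In that case, edit the
    configuration file and correct the JSON syntax. Note that
    <code class="docutils literal"><span class="pre">json.tool</span></code> accepts one input file followed by an optional output
    file, so if the wildcard matches more than one file, check each file
    separately.</p>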
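    <p>To check files one at a time, the command can be wrapped in a loop. The following is a minimal sketch rather than an official procedure: it assumes a POSIX shell and the server configuration path shown above, and the <code class="docutils literal"><span class="pre">valid</span></code> and <code class="docutils literal"><span class="pre">invalid</span></code> labels are illustrative only. Substitute the gateway or compute path as needed:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span># Sketch: validate each JSON file separately
    for f in /opt/wakari/wakari-server/etc/wakari/*.json; do
        # Discard the pretty-printed output; parse errors still go to stderr
        if python -m json.tool &quot;$f&quot; &gt; /dev/null; then
            echo &quot;valid:   $f&quot;
        else
            echo &quot;invalid: $f&quot;
        fi
    done
    </pre></div>
    </div>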
    </div>
    <div class="section" id="check-file-ownership">
    <h3>Check file ownership<a class="headerlink" href="#check-file-ownership" title="Permalink to this headline">¶</a></h3>
    <p>Verify that all files in <code class="docutils literal"><span class="pre">/opt/wakari/anaconda</span></code> belong to the user and group
    <code class="docutils literal"><span class="pre">wakari</span></code>:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">root</span><span class="nd">@server</span> <span class="c1"># find /opt/wakari/anaconda \! -user wakari -print</span>
    <span class="n">root</span><span class="nd">@server</span> <span class="c1"># find /opt/wakari/anaconda \! -group wakari -print</span>
    </pre></div>
    </div>
    <p>If any files are listed in the output, fix their ownership:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">chown</span> <span class="o">-</span><span class="n">R</span> <span class="n">wakari</span><span class="p">:</span><span class="n">wakari</span> <span class="o">/</span><span class="n">opt</span><span class="o">/</span><span class="n">wakari</span><span class="o">/</span><span class="n">anaconda</span>
    </pre></div>
    </div>
    </div>
    <div class="section" id="verify-that-posix-acls-are-enabled">
    <h3>Verify that POSIX ACLs are enabled<a class="headerlink" href="#verify-that-posix-acls-are-enabled" title="Permalink to this headline">¶</a></h3>
    <p>The <code class="docutils literal"><span class="pre">acl</span></code> option must be enabled on the filesystem containing the
    project root directory.</p>
    <p>First, determine the project root directory. If a custom projectRoot is
    configured, you can determine it with:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">root</span><span class="nd">@compute</span> <span class="c1"># grep projectRoot /opt/wakari/wakari-compute/etc/wakari/config.json</span>
    </pre></div>
    </div>
    <p>If the command prints nothing, the project root is the default, <code class="docutils literal"><span class="pre">/projects</span></code>.</p>
    <p>Either the <code class="docutils literal"><span class="pre">mount</span></code> options or the default mount options listed by <code class="docutils literal"><span class="pre">tune2fs</span></code>
    should show that the <code class="docutils literal"><span class="pre">acl</span></code> option is enabled. The example below uses the default project root:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">root</span><span class="nd">@compute</span> <span class="c1"># fs=`df /projects | tail -1 | cut -d &quot; &quot; -f 1`</span>
    <span class="n">root</span><span class="nd">@compute</span> <span class="c1"># mount | grep $fs</span>
    <span class="o">/</span><span class="n">dev</span><span class="o">/</span><span class="n">vda</span> <span class="n">on</span> <span class="o">/</span> <span class="nb">type</span> <span class="n">ext4</span> <span class="p">(</span><span class="n">rw</span><span class="p">)</span>
    <span class="n">root</span><span class="nd">@compute</span> <span class="c1"># tune2fs -l $fs | grep options</span>
    <span class="n">Default</span> <span class="n">mount</span> <span class="n">options</span><span class="p">:</span>    <span class="n">user_xattr</span> <span class="n">acl</span>
    </pre></div>
    </div>
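    <p>A functional test can complement the option check. The sketch below assumes that the project root is <code class="docutils literal"><span class="pre">/projects</span></code>, that the <code class="docutils literal"><span class="pre">setfacl</span></code> and <code class="docutils literal"><span class="pre">getfacl</span></code> utilities are installed, and that a <code class="docutils literal"><span class="pre">nobody</span></code> user exists; the temporary file name is arbitrary:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span># Sketch: confirm that ACL entries can be set under the project root
    tmp=$(mktemp /projects/acl-test.XXXXXX)
    # Grant read access to &quot;nobody&quot;, then display the resulting ACL
    setfacl -m u:nobody:r &quot;$tmp&quot; &amp;&amp; getfacl &quot;$tmp&quot;
    # Remove the temporary file whether or not the test succeeded
    rm -f &quot;$tmp&quot;
    </pre></div>
    </div>
    <p>If ACLs are not enabled on the filesystem, <code class="docutils literal"><span class="pre">setfacl</span></code> fails, typically with an &#8220;Operation not supported&#8221; error.</p>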
    </div>
    <div class="section" id="clear-browser-cookies">
    <h3>Clear Browser Cookies<a class="headerlink" href="#clear-browser-cookies" title="Permalink to this headline">¶</a></h3>
    <p>When the Anaconda Enterprise Notebooks configuration changes, or the software is upgraded,
    cookies remaining in the browser can cause issues. Clearing cookies and
    logging in again can help to resolve problems.</p>
    </div>
    </div>
    <div class="section" id="specific-problems">
    <h2>Specific Problems<a class="headerlink" href="#specific-problems" title="Permalink to this headline">¶</a></h2>
    <table border="1" class="docutils">
    <colgroup>
    <col width="22%" />
    <col width="11%" />
    <col width="67%" />
    </colgroup>
    <thead valign="bottom">
    <tr class="row-odd"><th class="head">Problem</th>
    <th class="head">Cause</th>
    <th class="head">Solution</th>
    </tr>
    </thead>
    <tbody valign="top">
    <tr class="row-even"><td>Browser indicates &#8220;too many redirects&#8221;</td>
    <td>Cookies are out of date</td>
    <td>Clear your browser&#8217;s cookies and cache, then try again.</td>
    </tr>
    <tr class="row-odd"><td>supervisorctl error: &#8220;unix:////opt/wakari/wakari-server/etc/supervisor.sock no such file&#8221;</td>
    <td>&#8220;supervisord&#8221; is not running on the Server</td>
    <td>Ensure that supervisord is included in the crontab, as described above. Then start supervisord manually.</td>
    </tr>
    <tr class="row-even"><td>Data Center Not Found message when deleting a project</td>
    <td>The data center has already been removed</td>
    <td>As root, run <code class="docutils literal"><span class="pre">/opt/wakari/wakari-server/bin/wk-server-admin</span> <span class="pre">remove-project</span> <span class="pre">--db-only</span> <span class="pre">&lt;user&gt;</span> <span class="pre">&lt;project&gt;</span></code></td>
    </tr>
    <tr class="row-odd"><td>Forgotten administrator password</td>
    <td>&nbsp;</td>
    <td>Use ssh to log into the server as root, and run the command <code class="docutils literal"><span class="pre">/opt/wakari/wakari-server/bin/wk-server-admin</span> <span class="pre">add-user</span> <span class="pre">wakari</span> <span class="pre">--admin</span> <span class="pre">-p</span> <span class="pre">&lt;new</span> <span class="pre">password&gt;</span> <span class="pre">-e</span> <span class="pre">&lt;your</span> <span class="pre">email&gt;</span></code>. You can then log into Anaconda Enterprise Notebooks as the <code class="docutils literal"><span class="pre">wakari</span></code> user with the new password you chose.</td>
    </tr>
    </tbody>
    </table>
    </div>
    <div class="section" id="logs">
    <h2>Logs<a class="headerlink" href="#logs" title="Permalink to this headline">¶</a></h2>
    <p>The locations of the Anaconda Enterprise Notebooks log files for each process
    and application are shown in
    the <a class="reference internal" href="#aen-troubleshooting-normal-operation"><span class="std std-ref">tables above</span></a>.</p>
    <p>The Anaconda Enterprise Notebooks installers write their logs to
    <code class="docutils literal"><span class="pre">/tmp/wakari_{server,gateway,compute}.log</span></code>.</p>
    <p>If log files grow too large, they can be deleted. To make the logs more or
    less verbose, change the Jupyter Notebook setting <a class="reference external" href="http://jupyter-notebook.readthedocs.io/en/latest/config.html">&#8216;Application.log_level&#8217;</a>.
    Setting &#8216;Application.log_level&#8217; to &#8216;ERROR&#8217; makes the logs less verbose than
    the default while keeping them fairly informative.</p>
    </div>
    <div class="section" id="killed-supervisord-and-error-this-socket-is-closed">
    <h2>Killed supervisord and &#8220;Error: This socket is closed.&#8221;<a class="headerlink" href="#killed-supervisord-and-error-this-socket-is-closed" title="Permalink to this headline">¶</a></h2>
    <p>When the supervisor daemon &#8220;supervisord&#8221; is killed, information sent to standard output (stdout) and standard error (stderr) is held in a pipe, which eventually fills up. Once the pipe is full, any attempt to start an app fails with the error message &#8220;This socket is closed.&#8221;</p>
    <p>To prevent this problem, always shut down and restart the processes cleanly. Do not shut down or kill supervisord without first shutting down wk-compute and any other processes that use it.</p>
    <p>To recover from this problem, stop the &#8220;wk-compute&#8221; process with <code class="docutils literal"><span class="pre">sudo</span> <span class="pre">kill</span> <span class="pre">-9</span></code> followed by its process ID. Then restart the supervisord and wk-compute processes:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">sudo</span> <span class="o">/</span><span class="n">etc</span><span class="o">/</span><span class="n">init</span><span class="o">.</span><span class="n">d</span><span class="o">/</span><span class="n">wakari</span><span class="o">-</span><span class="n">compute</span> <span class="n">stop</span>
    <span class="n">sudo</span> <span class="o">/</span><span class="n">etc</span><span class="o">/</span><span class="n">init</span><span class="o">.</span><span class="n">d</span><span class="o">/</span><span class="n">wakari</span><span class="o">-</span><span class="n">compute</span> <span class="n">start</span>
    </pre></div>
    </div>
    </div>
    <div class="section" id="service-error-502-can-not-connect-to-the-application-manager">
    <h2>Service Error 502: Can not connect to the application manager<a class="headerlink" href="#service-error-502-can-not-connect-to-the-application-manager" title="Permalink to this headline">¶</a></h2>
    <p>When a gateway node shows this error, a compute resource is not responding.</p>
    <p>This error occurs when the &#8220;wk-compute&#8221; process has been shut down. To recover, restart the supervisord and wk-compute processes:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">sudo</span> <span class="o">/</span><span class="n">etc</span><span class="o">/</span><span class="n">init</span><span class="o">.</span><span class="n">d</span><span class="o">/</span><span class="n">wakari</span><span class="o">-</span><span class="n">compute</span> <span class="n">stop</span>
    <span class="n">sudo</span> <span class="o">/</span><span class="n">etc</span><span class="o">/</span><span class="n">init</span><span class="o">.</span><span class="n">d</span><span class="o">/</span><span class="n">wakari</span><span class="o">-</span><span class="n">compute</span> <span class="n">start</span>
    </pre></div>
    </div>
    </div>
    <div class="section" id="communication-error-on-amazon-web-services">
    <h2>&#8220;502 Communication Error&#8221; on Amazon Web Services<a class="headerlink" href="#communication-error-on-amazon-web-services" title="Permalink to this headline">¶</a></h2>
    <p>If you see a page showing &#8220;502 Communication Error: This gateway could not communicate with the Wakari server&#8221; and the IP address of the Wakari server, configure the AEN gateway to use the DNS hostname of the server. On Amazon Web Services (AWS) this will be the DNS hostname of the Amazon Elastic Compute Cloud (EC2) instance.</p>
    </div>
    <div class="section" id="invalid-usernames">
    <h2>Invalid usernames<a class="headerlink" href="#invalid-usernames" title="Permalink to this headline">¶</a></h2>
    <p>The first character of a username must be a letter [a-z] or a digit [0-9].</p>
    <p>Each other character in a username may be a letter [a-z], a digit [0-9], a period [.], an underscore [_], or a hyphen [-].</p>
    <p>The POSIX standard <a class="reference external" href="http://serverfault.com/a/578264/117528">specifies</a> a portable filename character set, which is also the character set for portable usernames. The characters listed above are a subset of it, because POSIX additionally allows uppercase letters.</p>
    <p>An Anaconda Enterprise Notebooks username should be at least 3 characters and no more than 25 characters.</p>
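    <p>The rules above can be expressed as a single regular expression. The sketch below is illustrative only: <code class="docutils literal"><span class="pre">valid_username</span></code> is a hypothetical helper that is not part of Anaconda Enterprise Notebooks, and the product&#8217;s own validation may differ in detail:</p>
    <div class="highlight-default"><div class="highlight"><pre><span></span># Sketch: check a candidate username against the rules above.
    # First character: letter or digit; then 2 to 24 further characters
    # from [a-z0-9._-], giving a total length of 3 to 25.
    valid_username() {
        # LC_ALL=C keeps [a-z] from matching uppercase letters in some locales
        printf '%s\n' &quot;$1&quot; | LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9._-]{2,24}$'
    }
    valid_username &quot;jane.doe&quot; &amp;&amp; echo &quot;accepted&quot;
    valid_username &quot;_jane&quot; || echo &quot;rejected&quot;
    </pre></div>
    </div>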
    </div>
