Set Up a Data Analysis Environment with TypeScript

Sep 10, 2021

Similar to Pandas & Jupyter Notebook in Python

In this tutorial, we will set up an environment for data analysis with TypeScript and write our code in an interactive environment like Jupyter Notebook.

We will use danfo.js, which is built on top of TensorFlow and has an API similar to Pandas, and tslab, which makes TypeScript available in Jupyter Notebook.

Prerequisites

Install Node.js (LTS or Current)
Install Anaconda

Install tslab

First, install tslab with npm.

npm install -g tslab

Please make sure the tslab command is available in your terminal.

tslab install --version

Then, register tslab with your Jupyter environment.

tslab install --python=python3

Install dependencies

Switch to the folder where you want to run your TypeScript notebook.

cd your-folder

Initialize an npm project and install the two packages you will use in your tslab environment.

npm init -y
npm i danfojs-node tslab-plotly

Run tslab Notebook

jupyter lab

Run an example

Initialize a simple DataFrame to check if danfo.js is working.

import * as dfd from "danfojs-node"

let data = {
    "Name": ["Bear", "Wulf", "Lion", "Bird"],
    "Size": [21, 5, 30, 10],
    "Weight": [200, 300, 40, 250]
}
let df = new dfd.DataFrame(data, { index: ["a", "b", "c", "d"] })
df.print()
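To make the relationship between the column-oriented data object and the index labels more concrete, here is a minimal sketch in plain TypeScript. It does not use danfo.js, and toLabeledRows is a hypothetical helper written only for illustration. It shows how each index label lines up with one value from every column.

```typescript
// Hypothetical helper, not part of danfo.js: turns column-oriented data
// into labeled rows to show how columns and index labels line up.
type Columns = Record<string, (string | number)[]>;
type LabeledRow = { index: string; values: Record<string, string | number> };

function toLabeledRows(columns: Columns, index: string[]): LabeledRow[] {
  const names = Object.keys(columns);
  // Every column must provide exactly one value per index label.
  for (const name of names) {
    if (columns[name].length !== index.length) {
      throw new Error(
        `Column "${name}" has ${columns[name].length} values, expected ${index.length}`
      );
    }
  }
  return index.map((label, i) => {
    const values: Record<string, string | number> = {};
    for (const name of names) {
      values[name] = columns[name][i];
    }
    return { index: label, values };
  });
}

// Same data as in the DataFrame example above.
const animalData: Columns = {
  Name: ["Bear", "Wulf", "Lion", "Bird"],
  Size: [21, 5, 30, 10],
  Weight: [200, 300, 40, 250],
};
const animalRows = toLabeledRows(animalData, ["a", "b", "c", "d"]);
console.log(animalRows[2]); // the row labeled "c": Lion, 30, 40
```

The length check reflects the assumption that a tabular structure needs one value per column for every index label.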

Plot a few sample points to check if Plotly is working.

import Plotly from "tslab-plotly";
import * as tslab from "tslab";

Plotly.newPlot(tslab, [
  {
    x: [1, 2, 3, 4, 5],
    y: [1, 2, 4, 8, 16],
  },
]);
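The sample points above are powers of two. If you prefer to generate such a trace instead of typing the arrays by hand, a small sketch could look like this. The LineTrace type and the doublingTrace function are hypothetical names made up for this example and are not part of tslab-plotly.

```typescript
// Hypothetical helper: builds the x/y arrays for a trace in which
// y doubles with every step of x.
type LineTrace = { x: number[]; y: number[] };

function doublingTrace(points: number): LineTrace {
  // x runs from 1 to `points`
  const x = Array.from({ length: points }, (_, i) => i + 1);
  // y is 2^(x - 1): 1, 2, 4, 8, ...
  const y = x.map((value) => Math.pow(2, value - 1));
  return { x, y };
}

const sampleTrace = doublingTrace(5);
console.log(sampleTrace); // x: [1, 2, 3, 4, 5], y: [1, 2, 4, 8, 16]
```

In the notebook, sampleTrace could then take the place of the literal object in the Plotly.newPlot(tslab, [...]) call.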

Basic Usage of danfo.js for data analysis

Above you saw how to initialize a basic DataFrame.

Here is an example of how to initialize a basic Series object.

const dogNames = ['Lucy', 'Bello', 'Molly', 'Buddy'];

const s = new dfd.Series(dogNames, {columns: ['DogNames']})
s.print()

Read a CSV file and load it into a DataFrame.

dfd.read_csv('./path-to/your.csv', {columnConfigs: {DogNames: true}, configuredColumnsOnly: true})
.then(df => {
    df.head().print()
    df.tail().print()
    
    const s = new dfd.Series(df.values, {columns: ['DogNames']})
    s.print()

    console.log(s.dtype)
    console.log(s.shape)
}).catch(error => {
    console.log(error)
})

In the above example, I added a few more commands for inspecting the data.

I specified columnConfigs and configuredColumnsOnly, which means that all columns other than DogNames will be ignored.
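To illustrate the idea of keeping only the configured columns, here is a plain TypeScript sketch that does not depend on danfo.js. The function readConfiguredColumns and the sample CSV text are made up for this example, and the parser assumes simple comma-separated values without quoted fields. It is not how danfo.js reads files internally.

```typescript
// Hypothetical helper: parses simple comma-separated text (no quoted fields)
// and keeps only the requested columns, dropping everything else.
function readConfiguredColumns(
  csvText: string,
  wanted: string[]
): Record<string, string>[] {
  const lines = csvText.trim().split(/\r?\n/);
  const header = lines[0].split(",").map((h) => h.trim());
  // Look up the position of every wanted column in the header row.
  const positions = wanted.map((name) => {
    const pos = header.indexOf(name);
    if (pos === -1) throw new Error(`Column "${name}" not found`);
    return pos;
  });
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row: Record<string, string> = {};
    wanted.forEach((name, i) => {
      row[name] = cells[positions[i]].trim();
    });
    return row;
  });
}

// Made-up sample data with two columns that should be ignored.
const sampleCsv = `DogNames,Age,Owner
Lucy,3,Ann
Bello,5,Tom
Molly,2,Eve`;

const dogRows = readConfiguredColumns(sampleCsv, ["DogNames"]);
console.log(dogRows); // three rows, each with only a DogNames field
```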

With this command, we will print the first 5 rows of the DataFrame.

df.head().print()

// custom count, prints first 10
df.head(10).print()

With this command, we will print the last 5 rows of the DataFrame.

df.tail().print()
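If you want to see the head and tail semantics without a DataFrame, the following plain TypeScript sketch mimics them on an ordinary array. The functions firstRows and lastRows are hypothetical stand-ins written for this example. The default of 5 mirrors the behavior described above.

```typescript
// Hypothetical stand-ins for head() and tail() on a plain array of rows.
function firstRows<T>(rows: T[], count: number = 5): T[] {
  // Clamp negative counts to zero so slice() never counts from the end.
  return rows.slice(0, Math.max(count, 0));
}

function lastRows<T>(rows: T[], count: number = 5): T[] {
  // slice(-0) would return the whole array, so handle that case explicitly.
  return count <= 0 ? [] : rows.slice(-count);
}

// Twelve sample "rows": 1, 2, ..., 12
const sampleRows = Array.from({ length: 12 }, (_, i) => i + 1);

console.log(firstRows(sampleRows));     // [1, 2, 3, 4, 5]
console.log(firstRows(sampleRows, 10)); // [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
console.log(lastRows(sampleRows));      // [8, 9, 10, 11, 12]
```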

This command will print the data type of our Series object.

console.log(s.dtype)

This command will print the shape of our Series object.

console.log(s.shape)
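As a rough mental model of what dtype and shape describe, here is a plain TypeScript sketch. Everything in it is an assumption for illustration: inferDtype and seriesShape are hypothetical helpers, the dtype names are chosen to resemble the ones danfo.js reports, and the shape is assumed to be [rows, 1] for a one-dimensional Series. The actual inference rules in danfo.js may differ.

```typescript
// Hypothetical dtype labels, chosen to resemble the names danfo.js reports.
type SketchDtype = "int32" | "float32" | "string" | "boolean";

// Hypothetical helper: guesses a dtype for a non-empty list of values.
function inferDtype(values: (string | number | boolean)[]): SketchDtype {
  if (values.every((v) => typeof v === "boolean")) return "boolean";
  if (values.every((v) => typeof v === "number")) {
    // Whole numbers only -> integer type, otherwise floating point.
    return values.every((v) => Number.isInteger(v)) ? "int32" : "float32";
  }
  // Anything else (including mixed types) is treated as text in this sketch.
  return "string";
}

// Hypothetical helper: assumes a Series is one column with `length` rows.
function seriesShape(values: unknown[]): [number, number] {
  return [values.length, 1];
}

const sketchNames = ["Lucy", "Bello", "Molly", "Buddy"];
console.log(inferDtype(sketchNames));      // "string"
console.log(seriesShape(sketchNames));     // [4, 1]
console.log(inferDtype([21, 5, 30, 10]));  // "int32"
console.log(inferDtype([1.5, 2]));         // "float32"
```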

My conclusion is that this setup gives JavaScript / TypeScript developers a welcome opportunity to do data analysis in their own language, but at the time of writing danfo.js is not as capable as Pandas. For example, date and time handling is much more versatile in Pandas than in danfo.js.

I hope this will change in the near future and JavaScript / TypeScript developers will have the same capabilities in danfo.js.

There is an open pull request with a set of new features that will expand the capabilities of danfo.js.