Similar to Pandas & Jupyter Notebook in Python
In this tutorial, we will set up an environment for data analysis with TypeScript and write our code in an interactive environment like Jupyter Notebook.
We will use danfo.js, which is built on top of TensorFlow.js and has an API similar to Pandas, and tslab, which makes TypeScript available in Jupyter Notebook.
Prerequisites
Install tslab
First, install tslab with npm.
npm install -g tslab
Please make sure the tslab command is available in your terminal.
tslab install --version
Then, register tslab with your Jupyter environment.
tslab install --python=python3
Install dependencies
Switch to the folder where you want to run your TypeScript notebook.
cd your-folder
Initialize a package.json and install the two packages you will use in your tslab environment.
npm init -y
npm i danfojs-node tslab-plotly
Run tslab Notebook
jupyter lab
Run an example
Initialize a simple DataFrame to check if danfo.js is working.
import * as dfd from "danfojs-node"
let data = {
"Name": ["Bear", "Wulf", "Lion", "Bird"],
"Size": [21, 5, 30, 10],
"Weight": [200, 300, 40, 250]
}
let df = new dfd.DataFrame(data, { index: ["a", "b", "c", "d"] })
df.print()
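To make the column-oriented input a bit more concrete, here is a minimal, dependency-free sketch of how the same object lines up with the index labels. The helper toRows is hypothetical and not part of danfo.js; it only illustrates how each label pairs with the values at the same position in every column.

```typescript
// Hypothetical helper, not part of danfo.js: pairs each index label with
// the values found at the same position in every column.
type Columns = Record<string, (string | number)[]>;
type Row = Record<string, string | number>;

function toRows(columns: Columns, index: string[]): Record<string, Row> {
  const rows: Record<string, Row> = {};
  index.forEach((label, i) => {
    const row: Row = {};
    for (const name of Object.keys(columns)) {
      row[name] = columns[name][i];
    }
    rows[label] = row;
  });
  return rows;
}

// Same sample data as in the DataFrame example above.
const animalColumns: Columns = {
  Name: ["Bear", "Wulf", "Lion", "Bird"],
  Size: [21, 5, 30, 10],
  Weight: [200, 300, 40, 250],
};

const animalRows = toRows(animalColumns, ["a", "b", "c", "d"]);
console.log(animalRows["c"]); // { Name: 'Lion', Size: 30, Weight: 40 }
```

Each key of the input object becomes a column, and each index label addresses one row across all columns.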
Plot a few sample points to check if Plotly is working.
import Plotly from "tslab-plotly";
import * as tslab from "tslab";
Plotly.newPlot(tslab, [
{
x: [1, 2, 3, 4, 5],
y: [1, 2, 4, 8, 16],
},
]);
Basic Usage of danfo.js for data analysis
Above you saw how to initialize a basic DataFrame.
Here is an example of how to initialize a basic Series object.
const dogNames = ['Lucy', 'Bello', 'Molly', 'Buddy'];
const s = new dfd.Series(dogNames, {columns: ['DogNames']})
s.print()
Read a CSV file and load it into a DataFrame.
dfd.read_csv('./path-to/your.csv', {columnConfigs: {DogNames: true}, configuredColumnsOnly: true})
.then(df => {
df.head().print()
df.tail().print()
const s = new dfd.Series(df.values, {columns: ['DogNames']})
s.print()
console.log(s.dtype)
console.log(s.shape)
}).catch(error => {
console.log(error)
})
In the above example, I added a few more calls for inspecting the data.
I specified columnConfigs and configuredColumnsOnly, which means that all columns other than DogNames are ignored.
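The effect of keeping only the configured columns can be sketched without danfo.js. The helper readConfiguredColumns below is hypothetical, the Age column and its values are made-up sample data, and the parser assumes a simple CSV without quoted fields; it is an illustration of the idea, not how danfo.js reads files internally.

```typescript
// Hypothetical helper, not part of danfo.js: parses a simple CSV string
// (no quoted fields) and keeps only the requested columns.
function readConfiguredColumns(
  csv: string,
  wanted: string[]
): Record<string, string[]> {
  const [headerLine, ...lines] = csv.trim().split("\n");
  const header = headerLine.split(",");
  const result: Record<string, string[]> = {};
  for (const name of wanted) {
    const position = header.indexOf(name);
    if (position === -1) continue; // skip names missing from the header
    result[name] = lines.map((line) => line.split(",")[position]);
  }
  return result;
}

// Made-up sample data: only DogNames is requested, so Age is dropped.
const csvText = "DogNames,Age\nLucy,3\nBello,5\nMolly,2";
const selected = readConfiguredColumns(csvText, ["DogNames"]);
console.log(selected); // { DogNames: [ 'Lucy', 'Bello', 'Molly' ] }
```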
With this command, we will print the first 5 rows of the DataFrame.
df.head().print()
// custom count, prints first 10
df.head(10).print()
With this command, we will print the last 5 rows of the DataFrame.
df.tail().print()
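As a rough mental model of the head and tail calls described above, the following sketch reproduces the default of five rows with plain array slicing. The functions headRows and tailRows are hypothetical helpers for illustration only and are not part of danfo.js.

```typescript
// Hypothetical helpers, not part of danfo.js: they mirror the default of
// five rows described above using plain array slicing.
function headRows<T>(rows: T[], count = 5): T[] {
  return rows.slice(0, count);
}

function tailRows<T>(rows: T[], count = 5): T[] {
  // slice(-0) would return the whole array, so handle zero explicitly.
  return count === 0 ? [] : rows.slice(-count);
}

const sampleRows = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12];
console.log(headRows(sampleRows)); // [ 1, 2, 3, 4, 5 ]
console.log(tailRows(sampleRows)); // [ 8, 9, 10, 11, 12 ]
console.log(headRows(sampleRows, 10).length); // 10
```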
This command will print the data type of our Series object.
console.log(s.dtype)
This command will print the shape of our Series object.
console.log(s.shape)
My conclusion is that this setup gives JavaScript / TypeScript developers a welcome way to do data analysis in their own language, but at the time of writing danfo.js is not as capable as Pandas. For example, date and time handling is much more versatile in Pandas than in danfo.js.
I hope this will change in the near future and that JavaScript / TypeScript developers will get the same capabilities in danfo.js.
There is an open pull request with a set of new features that will expand the capabilities of danfo.js.
Thanks for reading, your feedback is welcome.