Day 4 of the 66 Days of Data

66 Days of Data

Documenting my Data Science Learning Journey

Today, I continued preparing the introductory lecture on coding and data science.

One of the great reasons for coding in Python is the vast collection of libraries available in the Python ecosystem. As a versatile, general-purpose language, Python can be used by a broad audience: hobbyists, scientists, engineers, business and financial analysts, and social scientists.

In the life sciences, there is a vast array of libraries for handling and making sense of biological and chemical datasets. In particular, the Biopython library allows the analysis of protein sequences.

I also relayed some great introductory concepts on data sources from the book, summarized in the following slide. As we can see, raw data starts out unstructured, and in order to perform any meaningful analysis, we have to pre-process the data so that it becomes structured.
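To make the pre-processing step concrete, here is a minimal sketch using only the Python standard library. The notes, field names and patterns are all made up for illustration and do not come from the book or the lecture; real raw data will usually need more careful handling.

```python
import re

# Hypothetical raw, unstructured notes (invented for this example)
raw_notes = [
    "Sample A weighed 12.5 mg on 2020-01-03",
    "on 2020-01-04, sample B: 9.8 mg",
    "no measurement recorded",
]

# Patterns for the pieces of information we want to extract
name_pattern = re.compile(r"[Ss]ample\s+([A-Z])")
weight_pattern = re.compile(r"(\d+(?:\.\d+)?)\s*mg")
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")


def to_record(note):
    """Return a dict with fixed fields, or None if any field is missing."""
    name = name_pattern.search(note)
    weight = weight_pattern.search(note)
    date = date_pattern.search(note)
    if not (name and weight and date):
        return None
    return {
        "sample": name.group(1),
        "weight_mg": float(weight.group(1)),
        "date": date.group(0),
    }


# Keep only the notes that could be turned into a complete record
records = [r for r in (to_record(n) for n in raw_notes) if r is not None]

for record in records:
    print(record)
```

Each free-text note becomes a record with the same three fields, and notes with no usable information are dropped. That consistent set of fields is what makes the result structured.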

This structured data takes the form of tabular data, which is essentially an M×N matrix with M rows and N columns.
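As a small illustration of the M×N idea, the sketch below stores a table as a plain list of rows. The column names and values are hypothetical; in practice a library such as pandas would typically hold this kind of data.

```python
# Hypothetical table: each inner list is one row (one observation),
# and each position within a row is one column (one variable)
columns = ["sample", "weight_mg", "date"]
rows = [
    ["A", 12.5, "2020-01-03"],
    ["B", 9.8, "2020-01-04"],
    ["C", 11.1, "2020-01-05"],
    ["D", 10.4, "2020-01-06"],
]

M = len(rows)      # number of rows
N = len(columns)   # number of columns

# A table is rectangular: every row must have exactly N entries
assert all(len(row) == N for row in rows)

print(f"The table is {M} x {N}")

# Extract a single column (one variable across all rows)
weight_index = columns.index("weight_mg")
weights = [row[weight_index] for row in rows]
print(weights)
```

Here M counts the observations and N counts the variables measured for each one, so adding a sample adds a row while adding a measurement adds a column.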

The slide below summarizes the common data types.
