Deep Tech Point
first stop in your tech adventure
June 13, 2022 | Data science

Word clouds are becoming increasingly popular in data science and analytics. They are the successors, or better said the younger siblings, of tag clouds. The main difference is that tag clouds are usually built from tags written manually by users or chosen from a predefined list, while word clouds are based on text analytics performed mostly by software algorithms. Word clouds are used to extract the essence of a text corpus and to represent it in graphic form. Words with a higher frequency are shown bigger, and those with a lower frequency are shown smaller or even excluded. This is the basic concept, but the actual algorithm is quite a bit more complex. So is it possible to apply the word cloud algorithm to custom text without coding?
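To make the basic concept concrete, here is a minimal Python sketch of the frequency-to-size idea. It is not the algorithm of any particular word cloud tool: the stop-word list, the `min_count` threshold, and the font-size range are all assumptions chosen for illustration, and real tools also handle layout, stemming, and much more.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real tools use far longer ones.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are"}


def word_sizes(text, min_count=2, min_size=10, max_size=48):
    """Map each sufficiently frequent word to a font size scaled by its count."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)

    # Exclude rare words, mirroring the "or even excluded" rule.
    kept = {w: c for w, c in counts.items() if c >= min_count}
    if not kept:
        return {}

    low, top = min(kept.values()), max(kept.values())
    span = (top - low) or 1  # avoid division by zero when all counts are equal

    # Linear scaling: the rarest kept word gets min_size, the most frequent max_size.
    return {
        w: round(min_size + (c - low) / span * (max_size - min_size))
        for w, c in kept.items()
    }


sample = "data science uses data and data tools; science tools help science"
sizes = word_sizes(sample)
print(sizes)
```

In this sample, "data" and "science" appear three times and get the largest size, "tools" appears twice and gets the smallest, and words that appear only once are dropped.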

June 5, 2022 | Data science

There are different methods for scraping data from a website and importing it into Excel. In this tutorial we will show you how to do it with Puppeteer, a Node.js package. The Puppeteer method might look scary at first, but it’s actually quite easy and, most importantly, it lets you extract anything you want from a web page very accurately. Knowing JavaScript is useful, but you can quickly learn enough of it to be able to scrape data.

June 1, 2022 | Data science

There are more and more arm64 CPUs (central processing units) on the market, in different hardware configurations, that can run Ubuntu Linux and handle serious processing tasks. Some cloud service providers have even added arm64 processors to their offering. One of the biggest names in the cloud industry, Oracle, offers interesting Always Free virtual machine instances based on the Ampere Arm chip. With these virtual machines, besides the Always Free plan, you get better scaling performance, meaning a more controlled price-performance ratio than with x86 architecture CPUs. It’s also nice to know that the cloud company won’t bill your credit card unless you explicitly change your pricing plan, which is not always the case in the cloud industry. Installing Puppeteer on arm64 processors is still a bit more complicated than on x86 CPUs, so read on to make it as straightforward as possible.

May 28, 2022 | Data science

There are many ways to scrape data from a website. You can do it with almost any programming language, but with varying success. Nowadays it’s a bit harder to succeed at website data scraping because many websites use advanced web technologies, progressive web apps and whatnot. In other words, it’s no longer just a matter of parsing static HTML. You need access to the DOM (Document Object Model) of the web page, because the DOM is usually modified interactively: when you interact with the page, new parts of HTML are added and others are removed on the fly.
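The gap between static HTML and the live DOM can be illustrated with a small Python sketch that uses only the standard library. The `RAW_HTML` page and the `ItemCollector` class are made up for this illustration. The parser sees the markup as delivered by the server, but it never runs the script that would add a second list item in a real browser.

```python
from html.parser import HTMLParser


class ItemCollector(HTMLParser):
    """Collect the text of every <li> element found in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.in_li = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_li = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_li = False

    def handle_data(self, data):
        # Only text inside an <li> is kept; script source is ignored.
        if self.in_li:
            self.items[-1] += data


# Hypothetical page: one item in the markup, one added by JavaScript at load time.
RAW_HTML = """
<ul id="products">
  <li>Laptop</li>
</ul>
<script>
  const li = document.createElement("li");
  li.textContent = "Phone";
  document.getElementById("products").appendChild(li);
</script>
"""

parser = ItemCollector()
parser.feed(RAW_HTML)
static_items = parser.items
print(static_items)
```

The static parse finds only "Laptop". The "Phone" item exists only after a browser executes the script, which is why tools that drive a real browser are often needed for such pages.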

April 20, 2022 | Data science

In this article, we are going to take a look at logistic or logit regression, and we are going to learn the differences between binary, multinomial, and ordinal logistic regression.

April 7, 2022 | Data science

In this article, we are going to investigate what outliers in linear regression are, why and when they are important, and what we should do about them: should we remove them from the data presentation or not?

April 4, 2022 | Data science

Linear regression is usually fitted with ordinary least squares (OLS), also called linear least squares, which is why the names are often used interchangeably, and it opens the door to the regression world. Linear regression is one of the most widely known modeling techniques and is usually among the first few topics that people master when they learn predictive modeling. We differentiate between simple and multiple linear regression, and in this article we’re going to focus on these two.
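As a rough illustration of what least squares does in the simple (one predictor) case, the following Python sketch computes the slope and intercept from the standard closed-form formulas. The function name `fit_simple_ols` and the sample data are assumptions made for this example. Multiple linear regression needs matrix algebra or a library and is not covered here.

```python
def fit_simple_ols(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx

    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_ols(xs, ys)
print(slope, intercept)
```

Because the sample points lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1. With noisy data the same formulas return the line with the smallest total squared error.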

March 31, 2022 | Data science

In this article, we are going to learn about one of the most significant predictive analytics tools for machine learning and big data: regression. We are going to define it and learn why and in which cases we use it. We are also going to take a look at seven types of regression analysis. We will learn which kinds of variables each regression technique is suited to, and we will discuss some of the key factors associated with each technique.

March 22, 2022 | Data science

In this article, we are going to learn a bit more about a popular method of creating and visualizing predictive models and algorithms: decision trees. We are going to learn what decision trees are, what types of decision trees exist, and when you should use each. Finally, at the end of the article, we will take a look at the advantages as well as the disadvantages of using decision trees.

March 12, 2022 | Data science

This article will take you into the world of predictive analysis. We will learn why it is important and what its benefits are. We will take a look at a few examples of businesses that use it and, most importantly, we will explore the three common types of models used in predictive analytics: decision trees, regression, and neural networks. In addition to that, we will take a look at predictive analytics tools that are powered by even more models, such as classification, clustering, forecast, outlier, and time-series models, among many others, as well as 5 common predictive analytics algorithms that can be applied to a wide range of use cases.