Introducing Pandas-Sets: Set-oriented Operations in Pandas

I frequently find myself storing standard Python set objects in DataFrame columns. This usually happens when I have some kind of tags or labels column for each observation. It can also be the output of a groupby operation where the end result needs to be a list-like (or set-like) object before it's aggregated. Using set operations (union, intersection, etc.) can come in handy in …
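As a rough illustration of the situation described above, here is a minimal sketch that uses only plain pandas and built-in Python sets. It does not use the Pandas-Sets API; the column names and sample data are made up for the example.

```python
import pandas as pd

# Hypothetical data: one set of tags per observation.
df = pd.DataFrame({
    "item": ["a", "b", "c"],
    "tags": [{"red", "small"}, {"red", "large"}, {"blue"}],
})

# Element-wise membership test: which rows carry the "red" tag?
has_red = df["tags"].apply(lambda s: "red" in s)

# Element-wise intersection of each row's set with a fixed set.
common = df["tags"].apply(lambda s: s & {"red", "large"})

# Union of every set in the column.
all_tags = set().union(*df["tags"])

# A groupby whose result is one set per group.
long_form = pd.DataFrame({
    "group": ["x", "x", "y"],
    "tag": ["red", "blue", "red"],
})
tags_per_group = long_form.groupby("group")["tag"].apply(set)
```

Going through `apply` with a lambda works, but it gets repetitive once set-valued columns show up often, which is the kind of friction a dedicated set accessor is meant to reduce.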

Deep Learning Resources

Online Courses: Andrew Ng’s Machine-Learning Class on Coursera; Geoff Hinton’s Neural Networks Class on Coursera (2012); U. Toronto: Introduction to Neural Networks (2015); Yann LeCun’s NYU Course; Ng’s Lecture Notes for Stanford’s CS229 Machine Learning; Nando de Freitas’s Deep Learning Class at Oxford (2015); Andrej Karpathy’s Convolutional Neural Networks Class at Stanford; Patrick Winston’s Introduction …

Summarize whole paragraph to sentence by Extractive Approach

To get a quick idea of a long document, we often summarize an article or book as we read it. In English, the first (or first two) sentence(s) of each article have a very high chance of representing the whole article. Of course, the topic sentence can be the last sentence in …
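The lead-sentence observation above can be turned into a very simple extractive baseline. The sketch below is only an illustration under stated assumptions, not the method from the post: the function name `lead_summary` and the naive regex sentence splitter are hypothetical choices, and the splitter will mis-handle abbreviations such as "e.g.".

```python
import re


def lead_summary(text, n_sentences=1):
    """Return the first n_sentences of text as an extractive summary."""
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [p.strip() for p in parts if p.strip()]
    return " ".join(sentences[:n_sentences])


article = (
    "Docker packages applications into containers. "
    "It is widely used. Many teams rely on it."
)
one_line = lead_summary(article)
two_lines = lead_summary(article, n_sentences=2)
```

Because it copies sentences verbatim instead of generating new text, this is extractive rather than abstractive summarization.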

Docker in a Nutshell

I want to start by tackling two important questions that we will answer throughout this blog post: What is Docker? Why do we use Docker? Let’s first answer why we use Docker by going through a quick demo. Let’s have a look at this …

Introduction to Natural Language Processing with NLTK

What is Natural Language Processing? Natural Language Processing (NLP) helps computers (machines) "read and understand" text or speech by simulating human language abilities. In recent years, NLP has grown rapidly because of an abundance of data. Given that more and more unstructured data is available, NLP has gained immense popularity. Prerequisites: Python 3.+, Jupyter Notebook, Natural …

Introduction to Kubernetes

Kubernetes is a powerful open-source system, initially developed by Google, for managing containerized applications in a clustered environment. It aims to provide better ways of managing related, distributed components and services across varied infrastructure. In this article, we'll discuss some of Kubernetes' basic concepts. We will talk about the architecture of the system, the …

Java 10 Features

After the Java 9 release, Java 10 came very quickly. Unlike the previous release, Java 10 does not have many exciting features; still, it has a few important updates that will change the way you code. Novelties in Java 10: local-variable type inference; root certificates for OpenJDK; change in Java garbage collecting; garbage collector interface; experimental Java-based JIT …

Spark study notes: core concepts visualized

Learning Spark is not easy for someone with little background knowledge of distributed systems. Even though I have been using Spark for quite some time, I find it time-consuming to get a comprehensive grasp of all the core concepts in Spark. The official Spark documentation provides a very detailed explanation, yet it focuses more …

The basis of Azure Data Factory

In the world of big data, raw, unorganized data is often stored in relational, non-relational, and other storage systems. However, on its own, raw data doesn’t have the proper context or meaning to provide meaningful insights to analysts, data scientists, or business decision makers. Big data requires a service that can orchestrate and operationalize processes to …