Mar 28, 2021

AWS ML Community showcase: March 2021 edition
       
In our Community Showcase, Amazon Web Services (AWS) highlights projects created by AWS Heroes and AWS Community Builders. Here are a few highlights of externally published getting-started guides and tutorials curated by our AWS ML Evangelist team, led by Julien Simon. AWS ML Heroes and AWS ML Community Builder projects include "Making My Toddler's Dream of Flying Come True with AI Tech" (with code samples). Choose from community-created and ML-focused blogs, videos, eLearning guides, and much more from the AWS ML community. About the author: Cameron Peron is Senior Marketing Manager for Amazon Rekognition and the AWS AI/ML community.
9 Comprehensive Cheat Sheets For Data Science
       
Data science is one of those tech fields that has exploded in popularity and resources in recent years. This comprehensive 10-page cheat sheet covers all the core basics of probability theory, containing a semester's worth of material. №3: SQL. You can't spell data science without "data", after all; data scientists try to figure out the story their data is trying to tell, and then use that story to make predictions on new data. Stanford University has created a comprehensive machine learning cheat sheet that contains sub-cheat-sheets for supervised learning, unsupervised learning, model metrics, and deep learning. That's why the last cheat sheet on this list is a Jupyter Notebook cheat sheet.
Fine-tuning pretrained NLP models with Huggingface’s Trainer
       
There are many pretrained models which we can use to train our sentiment analysis model; let us use pretrained BERT as an example. You can search for more pretrained models to use on the Huggingface Models page.

model_name = "bert-base-uncased"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=2)

Since we are using a pretrained model, we need to ensure that the input data is in the same form as what the pretrained model was trained on. Step 2: preprocess text using the pretrained tokenizer initialised earlier:

X_train_tokenized = tokenizer(X_train, padding=True, truncation=True, max_length=512)
X_val_tokenized = tokenizer(X_val, padding=True, truncation=True, max_length=512)

Next, we specify some training parameters and set the pretrained model, train data, and evaluation data in the TrainingArguments and Trainer classes.
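As a rough sketch of how those pieces fit together (assuming the tokenized encodings are wrapped in a small torch Dataset and that label lists y_train and y_val exist; the argument values below are illustrative, not from the article):

import torch
from transformers import Trainer, TrainingArguments

class SentimentDataset(torch.utils.data.Dataset):
    # wraps the tokenizer output so the Trainer can index into it
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item
    def __len__(self):
        return len(self.labels)

train_dataset = SentimentDataset(X_train_tokenized, y_train)
val_dataset = SentimentDataset(X_val_tokenized, y_val)

args = TrainingArguments(output_dir="output", num_train_epochs=3,
                         per_device_train_batch_size=8)  # illustrative values
trainer = Trainer(model=model, args=args,
                  train_dataset=train_dataset, eval_dataset=val_dataset)
trainer.train()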
Using TFRecords to Train a CNN on MNIST
       
When I started with TFRecords, it took me a while to understand the concept behind them. The first option is useful when we create our TFRecord dataset; the second option allows slightly more comfortable iterating. Afterwards, we shuffle our data, set a batch size, and call repeat with no argument, which means repeating endlessly. To recap up to this point: we created two TFRecord files, one for the training data and one for the test data.
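A minimal sketch of the writing step (assuming the MNIST arrays from tf.keras.datasets; the filename is illustrative):

import tensorflow as tf

(train_x, train_y), _ = tf.keras.datasets.mnist.load_data()

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

with tf.io.TFRecordWriter("train.tfrecord") as writer:
    for image, label in zip(train_x, train_y):
        example = tf.train.Example(features=tf.train.Features(feature={
            "image": _bytes_feature(image.tobytes()),  # raw 28x28 uint8 bytes
            "label": _int64_feature(int(label)),
        }))
        writer.write(example.SerializeToString())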
How Zebra Medical Vision Developed Clinical AI Solutions
       
That’s why, for the first two years, Zebra Medical Vision hardly did any machine learning at all. They dubbed this role the “Clinical Information Manager” — and this person usally has a PhD in biomedical engineering or clinical research. With all this infrastructure and this team in place, Zebra Medical Vision can move at a dazzling pace. The more practical approach: Zebra Medical Vision realized that patients get CT scans for many other diseases. So Zebra Medical Vision built an algorithm that locates and identifies these compression factors.
Two-layered Recommender System Methodology: A Prize-Winning Solution
       
The Cinema Challenge hackathon was held from 14 to 22 November 2020. Participants could choose one of three tracks: Challenge 1, a film recommender system; Challenge 2, a TV-program recommender system; or the Projects Contest. A TV program was considered watched if the user watched over 80% of it and did not change the channel. Solution accuracy was assessed using Kaggle. Top features: user watch time, user features from LightFM, and schedule changes.
iPad Pro + Raspberry Pi for Data Science Part 4: Installing Kubernetes for Learning Purposes
       
Hello there, friends! We're back again with a fourth part in our series on enabling a Raspberry Pi to work directly with an iPad Pro. Minikube has been awesome, but it unfortunately doesn't work with the Raspberry Pi. This is because the Raspberry Pi uses an ARM-based CPU architecture, and there currently isn't a flavor of Minikube that supports it. Unique to this deployment of Kubernetes, K3s makes use of a specific k3s.yaml file for configuration settings.
Three More Ways To Make Fictional Data
       
Since writing about this topic earlier, a handful of folks throughout the community have shared with me their own picks for tools that generate fictional data. Evaluation: as a reminder, I'm evaluating each tool on its ability to replicate the results from an earlier article, "How To Make Fictional Data." Faker gets very close. To get a longitude and latitude in the desired range (location), I used the generic number-range data type; I also used it for weight and wing-span. "Mockaroo lets you generate up to 1,000 rows of realistic test data in CSV, JSON, SQL, and Excel formats." Mockaroo boasts 145 data types.
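For a flavor of the Faker approach described above (the field names and ranges here are made up for illustration, not taken from the article):

from faker import Faker

fake = Faker()
bird = {
    "name": fake.first_name(),
    # generic number-range types stand in for location, weight, and wing-span
    "latitude": fake.pyfloat(min_value=40.0, max_value=45.0),
    "longitude": fake.pyfloat(min_value=-80.0, max_value=-70.0),
    "weight_g": fake.pyint(min_value=15, max_value=40),
    "wingspan_cm": fake.pyint(min_value=20, max_value=35),
}
print(bird)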
Object Extraction From Images
       
Object extraction using Skimage: let's say you have an image like the one below, which is exactly the same as the one above, except that I manually added a "white stain" in the middle. Smart guys may already notice that, with a "level" value of 150, we get more pixels contained within a contour. Geniuses may already discover that the contours essentially separate the pixel values that are smaller than the "level" value from those that are larger. A side note: sometimes your image might be too sharp, meaning that the pixel values have large variations within the object. If the pixel values vary across the "level" value, multiple contours can be detected inside the object.
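A small sketch of that contour-based extraction (the level value of 150 comes from the article; the filename and image loading are illustrative):

import numpy as np
from skimage import io, measure

image = io.imread("blob.png", as_gray=True) * 255  # grayscale, 0-255 range
contours = measure.find_contours(image, level=150)

# keep the longest contour as the boundary of the extracted object
longest = max(contours, key=len)
print(f"Found {len(contours)} contours; longest has {len(longest)} points")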
Deep Learning for Semantic Text Matching
       
Candidate generation. Inverted-index-based candidate generation: traditionally, a token-based inverted index is used for candidate generation. The union of these individual lists, with BM25 scoring, can be used as the candidate documents for the next step of reranking. But such token-based retrieval has limitations, as it does not capture semantic relations between words. Embedding-based candidate generation: recently, candidate generation using DNN embeddings has become popular, as they better capture query and document semantics. Word2vec CBoW and Skip-gram are two early word-embedding models that generated a lot of interest in dense word embeddings.
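A toy sketch of embedding-based candidate generation (cosine similarity over precomputed document embeddings; the array names and sizes are illustrative):

import numpy as np

def top_k_candidates(query_vec, doc_matrix, k=100):
    # cosine similarity between the query embedding and every document embedding
    sims = doc_matrix @ query_vec
    sims /= (np.linalg.norm(doc_matrix, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    return np.argsort(-sims)[:k]  # indices of the k nearest documents

# e.g. 10,000 documents embedded in 300 dimensions
doc_matrix = np.random.randn(10000, 300)
query_vec = np.random.randn(300)
candidates = top_k_candidates(query_vec, doc_matrix, k=5)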
How I Taught My Air Conditioner Some Hebrew
       
So I was thinking to myself: “What if I just taught that air conditioner to recognize my speech”? Sensibo IoT sensors let you toggle anything to do with your air conditioner from your phone. All of the above meant that I only needed to write my own “button clicking” app, which instead of waiting for my finger tap would listen to my voice to toggle the air conditioner. It turns out that a really big percentage of speech recognition models listen to the user speaking and try to classify the speech as whole words. If the user waits for too long before the air conditioner does anything, he’ll just park his car and click the normal button.
Two-Dimensional (2D) Test Functions for Function Optimization
       
Two-dimensional functions take two input values (x and y) and output a single evaluation of the input. In this tutorial, you will discover standard two-dimensional functions you can use when studying function optimization. Tutorial overview: a two-dimensional function is a function that takes two input variables and computes the objective value. Nevertheless, there are standard test functions that are commonly used in the field of function optimization. We will explore a small number of simple two-dimensional test functions in this tutorial and organize them by their properties into two groups:

Unimodal Functions: Unimodal Function 1, Unimodal Function 2, Unimodal Function 3
Multimodal Functions: Multimodal Function 1, Multimodal Function 2, Multimodal Function 3

Each function will be presented using Python code, with a function implementation of the target objective function and a sampling of the function shown as a surface plot.
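In the style of the tutorial, a unimodal bowl function sampled on a grid and rendered as a surface plot might look like this (a sketch, not the tutorial's exact code):

import numpy as np
import matplotlib.pyplot as plt

def objective(x, y):
    return x ** 2.0 + y ** 2.0  # simple unimodal "bowl" with optimum at (0, 0)

# sample the input space on a uniform grid
xs = np.arange(-5.0, 5.0, 0.1)
ys = np.arange(-5.0, 5.0, 0.1)
x, y = np.meshgrid(xs, ys)
z = objective(x, y)

# plot the objective as a surface
fig = plt.figure()
axis = fig.add_subplot(projection="3d")
axis.plot_surface(x, y, z, cmap="jet")
plt.show()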
Google AI Blog: Constructing Transformers For Longer Sequences with Sparse Attention Methods
       
Moreover, we show theoretically that our proposed sparse attention mechanism preserves the expressivity and flexibility of quadratic, full-attention Transformers. To achieve structured sparsification of self-attention, we developed the global-local attention mechanism. In the BigBird paper, we explain why sparse attention is sufficient to approximate quadratic attention, partially explaining why ETC was successful. Behind both ETC and BigBird, one of our key innovations is an efficient implementation of the sparse attention mechanism. Conclusion: we show that carefully designed sparse attention can be as expressive and flexible as the original full-attention model.
GPT-3 Powers the Next Generation of Apps
       
Given any text prompt like a phrase or a sentence, GPT-3 returns a text completion in natural language. Applications and industriesTo date, over 300 apps are using GPT-3 across varying categories and industries, from productivity and education to creativity and games. Using GPT-3, Viable identifies themes, emotions, and sentiment from surveys, help desk tickets, live chat logs, reviews, and more. Algolia Answers helps publishers and customer support help desks query in natural language and surface nontrivial answers. With natural language processing, technical experience is no longer a barrier, and we can truly keep our focus on solving real world problems.
Introducing Amazon Lookout for Metrics: An anomaly detection service to proactively monitor the health of your business
       
You can connect Lookout for Metrics to 19 popular data sources, including Amazon Simple Storage Service (Amazon S3), Amazon CloudWatch, Amazon Relational Database Service (Amazon RDS), and Amazon Redshift, as well as software-as-a-service (SaaS) applications like Salesforce, Marketo, and Zendesk, to continuously monitor the metrics important to your business. Solution overview: this post demonstrates how you can set up anomaly detection on a sample ecommerce dataset using Lookout for Metrics. For Datasource, choose Amazon S3. About the authors: Ankita Verma is the Product Lead for Amazon Lookout for Metrics. He has a special interest in launching AI services and helped grow and build Amazon Personalize and Amazon Forecast before focusing on Amazon Lookout for Metrics.
Configure Amazon Forecast for a multi-tenant SaaS application
       
Forecast data ingestion: Forecast imports data from the tenant's Amazon Simple Storage Service (Amazon S3) bucket into the Forecast-managed S3 bucket. For example:

s3://tenant_a [ Tag tenant = tenant_a ]
s3://tenant_b [ Tag tenant = tenant_b ]

There is a hard limit on the number of S3 buckets per account. The tenant tag validation condition in the following code makes sure that the tenant tag value matches the principal's tenant tag.
Evaluation Bias: Are you inadvertently training on your entire dataset?
       
A good first step is to start using a third validation split for evaluating your training runs. You only use this holdout split for evaluation purposes once you feel that you already have a model that will generalize well based on how it has performed on the validation data. Remember, the underlying reason that we use a 3rd validation split is not to hide the samples from the algorithm. If you have a lot of data, then you can afford to let your validation and test splits eat into your training set. There are latent, unseen features that our model is trying to tease out during training.
Vital Signs: Assessing Data Health and Dealing with Outliers
       
At the doctor's office, you and the medical assistant go through a familiar routine before the doctor arrives. The Data Health Tool, now included in the Alteryx Intelligence Suite, does something similar for your data. About the Data Health Tool: the Data Health Tool gathers "vital signs" for your dataset that reveal whether it's ready to yield robust, accurate insights, or whether it would benefit from some special treatment first. The Data Health Tool uses a method established in peer-reviewed research in 2008 (read it here). With the Data Health Tool, once your outliers are identified, it's up to you how you prefer to proceed with handling them.
Forecasting Climate Change in Italy with Long Short Term Memory Networks
       
Introduction: we are living in a time full of challenges for humanity, and one of the biggest is climate change. So, in this time of change and difficulty, how important is the role of data science in climate change? From my personal point of view, the role of data science in climate change will get bigger and bigger in the near future. With algorithms that take into consideration local weather, climate patterns, or household behaviour, data scientists can predict how much energy we need in real time and over the long term. In agriculture, with IoT devices that sense soil moisture and nutrients, in conjunction with weather data, farmers can gain better control of their irrigation and fertilizer systems.
Building a Naive Bayes Machine Learning Model to Classify Text
       
A quick-start guide to get you up and running with an easy yet highly relevant NLP project in Python. Natural Language Processing (NLP) is an extremely exciting field. One of the simpler supervised Bayesian network models, the Naive Bayes algorithm is a probabilistic classifier based on Bayes' Theorem (which you might remember from high school statistics). Alternatively, you might consider a deep learning approach based on neural networks, but this would require far more training data. The solution is to extract features from the text and turn them into vectors that can be understood by the model. I hope you gained a good high-level understanding of how Naive Bayes works, and how to implement it for classifying text specifically.
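A minimal sketch of that feature-extraction-plus-classifier pattern in scikit-learn (the toy texts and labels are made up):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["free money now", "meeting at noon", "win a prize", "project update"]
labels = ["spam", "ham", "spam", "ham"]

# turn text into count vectors, then fit a multinomial Naive Bayes on them
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)
print(model.predict(["free prize money"]))  # -> ['spam']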
Demystify Deep Learning Terminologies and Build Your First Neural Network
       
Parts of a neural network: the image below is a basic representation of a neural network. Layer — this refers to a collection of neurons operating together at a specific depth in a neural network. Deep neural network (DNN) — this is when a neural network contains a deep stack of hidden layers (several of the middle columns). Optimizer — this is a technique for modifying the attributes of the neural network, such as the weights and the learning rate, so as to reduce the loss. We will feed our neural network the data and have it determine the relationship between the two sets.
Feature Store: Data Platform for Machine Learning
       
Feature data (or simply, features) are critical to the accurate predictions made by machine learning (ML) models. In the following, I will briefly survey the leading feature stores at two tech companies, Uber and Airbnb, and an open-source feature store, Feast. Airbnb: Zipline — Airbnb built its feature store, called Zipline, at least four years ago. (4) Feature quality monitoring: it is common to see feature pipeline breakages, missing feature data, drift, and inconsistency. My thoughts: for a generic ML data platform, here are my personal thoughts. (1) One of the most valuable and challenging problems is the data transformation from raw data into high-quality, ML-friendly features.
Mouse-free data science. Detect your cat’s prey with a Raspberry…
       
For months, our two outdoor cats carried dead, living, and partially living mice into our home at night. The plan to create a smarter cat flap that prevents the cats' unwanted gifts turned into an "idée fixe" that received a subdued smile from my wife. However, 8 out of 10 detections resulted in the cat flap locking after the cat had already passed through. I tried to shorten the delay with a bypass to the API service of the cat flap. Then I tried to connect my Raspberry Pi to the cat flap via ZigBee.
Predicting political orientation with Machine Learning
       
Note: this is not a political post, and the scientific analysis has been done without any bias. :) In this blog, we will use (traditional) machine learning techniques to predict the political orientation of Twitter users, using Python. I'm a physicist, and I don't like it when machine learning is applied as a black box. Vectorization can be performed with more sophisticated machine learning techniques, usually involving deep learning. We have our 5000 vectors, so we are just considering a basic machine learning classification task.
Audio Deep Learning Made Simple: Automatic Speech Recognition (ASR), How it Works
       
Load audio files: start with input data that consists of audio files of spoken speech in a format such as ".wav" or ".mp3". Read the audio data from the file and load it into a 2D Numpy array. Convert to uniform dimensions (sample rate, channels, and duration): we might have a lot of variation in our audio data items. Since our deep learning models expect all input items to have a similar size, we now perform some data-cleaning steps to standardize the dimensions of our audio data. However, as we've just seen with deep learning, we required hardly any feature engineering involving knowledge of audio and speech.
Feeding the Beast: The Data Loading Path for Deep Learning Training
       
Optimize your deep learning training process by understanding and tuning data loading from disk to GPU memory. Deep learning experimentation speed is important for delivering high-quality solutions on time. The data loading path, i.e. the journey of examples from disk into GPU memory, is worth tuning: use smaller values (e.g. 16/32) if you have a small amount of training data, and larger values if you have a lot of training data. In the simple case, transforming the raw input example into a training example is as simple as decoding a .jpg into pixels. Stage 2, loading examples from storage: in the majority of cases, I/O is the largest cost in data loading.
Multi-Class Classification With Transformers
       
Building the model. First, we need to initialize our pre-trained BERT model. We will be building a frame around BERT using the typical tf.keras layers. There are a few parts to this frame: two input layers (one for the input IDs, and another for the attention mask). Once this is done, we can freeze the BERT layers to speed up training (at the cost of a likely performance decrease). The reason we freeze the BERT parameters is that there are a lot of them, and updating these weights would significantly increase training time.
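A sketch of such a frame (the sequence length, head size, and layer index used for freezing are illustrative; the exact attribute for freezing depends on the transformers version):

import tensorflow as tf
from transformers import TFAutoModel

bert = TFAutoModel.from_pretrained("bert-base-uncased")

# two input layers: token IDs and the attention mask
input_ids = tf.keras.layers.Input(shape=(128,), dtype="int32", name="input_ids")
mask = tf.keras.layers.Input(shape=(128,), dtype="int32", name="attention_mask")

# last hidden states from BERT, pooled and fed through a small classifier head
embeddings = bert.bert(input_ids, attention_mask=mask)[0]
x = tf.keras.layers.GlobalMaxPooling1D()(embeddings)
outputs = tf.keras.layers.Dense(5, activation="softmax")(x)

model = tf.keras.Model(inputs=[input_ids, mask], outputs=outputs)
model.layers[2].trainable = False  # freeze the BERT layer to speed up training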
Google AI Blog: Recursive Classification: Replacing Rewards with Examples in RL
       
Top: To teach a robot to hammer a nail into a wall, most reinforcement learning algorithms require that the user define a reward function. Doing so avoids potential bugs and bypasses the process of defining the hyperparameters associated with learning a reward function (such as how often to update the reward function or how to regularize it) and, when debugging, removes the need to examine code related to learning the reward function. Right: In the example-based control approach, the model is provided only with unlabeled experience (grey circles) and success examples (green circles), so one cannot apply standard supervised learning. Instead, the model uses the success examples to automatically label the unlabeled experience. The key difference is that the approach described here does not require a reward function.
AI names colors much as humans do
       
What the research is:Across the thousands of different languages spoken by humans, the way we use words to represent different colors is remarkably consistent. Facebook AI has now shown that cutting-edge AI systems behave similarly. The images on the left show two color-naming systems created entirely by neural networks. How it works:We built two neural networks, a Speaker and a Listener, and tasked them with playing the “communication game” illustrated below. This chart shows color-naming systems created by human languages (shown in blue) and by neural networks (shown in orange).
Amazon Kendra adds new search connectors from AWS Partner, Perficient, to help customers search enterprise content faster
       
Today, Amazon Kendra is making nine new search connectors available in the Amazon Kendra connector library, developed by Perficient, an AWS Partner. Improving the enterprise search experience: these days, employees and customers expect an intuitive search experience. Perficient has years of experience developing data-source connectors for a wide range of enterprise data sources. To get started with Amazon Kendra, visit the Amazon Kendra Essentials+ workshop for an interactive walkthrough. To learn more about other Amazon Kendra data-source connectors, visit the Amazon Kendra connector library.
Careers in Machine Learning, Python Music, and AI’s Brain Connection
       
Head straight to Eugene Yan's invigorating Q&A with Chip Huyen, where Chip shares too many valuable insights to count (about machine learning and getting into Stanford, yes—but also about setting goals and finding community through writing). This week's must-reads: many a data scientist has started out thinking that a machine learning career revolves around mastery and expertise. Of course, networking and business acumen will only get you so far if you can't produce valuable work, which itself often relies on highly specialized knowledge. Sit back with your snack of choice and treat yourself to Mark Saroufim's thought-provoking polemic on the current state of machine learning, including an unflinching look at the parts of the field that no longer feel vibrant. She stresses the importance of finding the right learning rhythm, and of balancing ambitious goals with realistic expectations.
What is MLOps — Everything You Must Know to Get Started
       
This new requirement of building ML systems adds to and reshapes some principles of the SDLC, giving rise to a new engineering discipline called MLOps. In order to understand MLOps, we must first understand the ML system lifecycle. Model training and experimentation (data science): as soon as your data is prepared, you move on to the next step of training your ML model. You can add version control to all the components of your ML systems (mainly data and models) along with the parameters. Other tasks include testing a model by writing unit tests for model training.
Analyzing and Interpreting Data From Rating Scales
       
Note: the code for this post can be found here. Rating scales are an effective and popular way to gauge attitudes and opinions. The goal of this two-part series is to demonstrate the basic concepts needed to effectively utilize rating-scale data, as well as to warn about common pitfalls. Understanding the rating scale: each rating scale is implemented as a closed-ended question to elicit information. Just as physical instruments measure physical properties (a thermometer for temperature, a ruler for length), rating scales can be used to measure properties that are cognitive in nature. One common pitfall with rating-scale analytics is the assumption that the distances between choices are equal.
How To Use Data (and Psychology)To Get More Data
       
Getting people to fill out a survey is an unfortunately complicated business. As we all know, good data, and a good amount of it, is essential to building any working model. There are also attributes that all individuals value, which will almost always have a positive effect, but I'll come to that in the next section. You may get just as many responses, if not more, due to individuals' good nature. Conclusion: to sum it all up, here's a summary of my process when looking to survey individuals.
Removing the “Adversarial” in Generative Adversarial Networks
       
The generator's goal is to maximize the discriminator's loss by generating convincing pictures that are indistinguishable from the dataset images; the discriminator's goal is to minimize its loss by classifying real and fake images with high accuracy. A commonly used algorithm for GANs is gradient descent-ascent, which alternates between two steps, one of which is a gradient ascent step towards maximizing the discriminator's loss. Other variants of gradient descent-ascent have been proposed to address the issue of nonconvergence.
Understand MapReduce Intuitively
       
There are numerous methodologies for increasing performance, but the most commonly used technique is known as MapReduce. Worker nodes are assigned numerous jobs to be performed ahead of time, and all the nodes complete their jobs simultaneously. However, it may not prove to be any faster than implementing the function itself without MapReduce. Simply put, MapReduce is a procedure utilized to its maximum potential in parallel computing. Conclusion: as the intuition of MapReduce starts to form, it is quite simple to see its utility in Data Science and Machine Learning.
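The intuition fits in a few lines of plain Python (a toy word count; real MapReduce frameworks distribute the mapped chunks across worker nodes):

from collections import Counter
from functools import reduce

chunks = ["the cat sat", "the dog sat", "the cat ran"]

# map: each worker counts words in its own chunk independently
mapped = [Counter(chunk.split()) for chunk in chunks]

# reduce: merge the per-chunk counts into one result
total = reduce(lambda a, b: a + b, mapped)
print(total)  # Counter({'the': 3, 'cat': 2, 'sat': 2, 'dog': 1, 'ran': 1})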
Curious about Variational Autoencoders (VAEs)? Start Here.
       
Curious about Variational Autoencoders (VAEs)? In recent years, GANs (generative adversarial networks) have been all the rage in the field of deep-learning generative models, leaving VAEs in relative obscurity. But there’s much to gain from a solid footing in variational autoencoders, which tackle similar challenges but use a different architectural foundation. If you were looking for an engaging, accessible way to learn more about VAEs, Joseph and Baptiste Rocca’s introduction hits the spot. They define terms, walk us through the various elements that make up VAEs and how they relate to each other, and add beautiful illustrations for all the visual learners out there.
CNNs for Audio Classification
       
CNNs, or convolutional neural nets, are a type of deep learning algorithm that does really well at learning images. These properties make CNNs formidable learners for images, because the real world doesn't always look exactly like the training data. The data for this example are bird and frog recordings from the Kaggle competition Rainforest Connection Species Audio Detection. Scale and pad the audio features so that every "channel" is the same size. Lastly, stop iterating when you note a decrease in performance on the validation data compared to the training data.
Super Resolution: Adobe Photoshop versus Leading Deep Neural Networks
       
How effective is Adobe's Super Resolution compared to the leading super-resolution deep neural network models? There are many positive comments describing how good Adobe Photoshop's Super Resolution is, such as "Made My Jaw Hit the Floor". Adobe Camera Raw's Super Resolution, or the equivalent Photoshop Camera Raw filter, is a recent, very fast, and easy-to-use super-resolution method, available literally by clicking "Enhance" in Adobe's products that use Camera Raw. In each example, the left image is bicubic-interpolation upscaling, the centre image is Adobe's Super Resolution, and the right image is the IDN deep neural network's super resolution. The visual improvement in resolution and quality with most of the images is very noticeable with Adobe's Super Resolution, although artifacts are introduced or exaggerated that are not present in the IDN deep neural network's output.
How to Manually Optimize Machine Learning Model Hyperparameters
       
In this tutorial, you will discover how to manually optimize the hyperparameters of machine learning algorithms. In this section, we will explore how to manually optimize the hyperparameters of the Perceptron model. For example:

# define model
model = XGBClassifier()

Before we tune the hyperparameters of XGBoost, we can establish a performance baseline using the default hyperparameters. Summary: in this tutorial, you discovered how to manually optimize the hyperparameters of machine learning algorithms.
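The manual loop itself is short; a sketch for the Perceptron's learning rate (the candidate values and synthetic dataset are illustrative):

from numpy import mean
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=10, random_state=1)

# try a handful of learning rates by hand and keep the cross-validated mean
for eta0 in [0.0001, 0.001, 0.01, 0.1, 1.0]:
    model = Perceptron(eta0=eta0)
    scores = cross_val_score(model, X, y, scoring="accuracy", cv=10)
    print(f"eta0={eta0}: {mean(scores):.3f}")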
Google AI Blog: Progress and Challenges in Long-Form Open-Domain Question Answering
       
Open-domain long-form question answering (LFQA) is a fundamental challenge in natural language processing (NLP) that involves retrieving documents relevant to a given question and using them to generate an elaborate paragraph-length answer. While there has been remarkable recent progress in factoid open-domain question answering (QA), where a short phrase or entity is enough to answer a question, much less work has been done in the area of long-form question answering. It achieves a new state of the art on ELI5, the only large-scale publicly available dataset for long-form question answering. Our submission tops the KILT leaderboard for long-form question answering on ELI5 with a combined KILT R-L score of 2.36. The follow-up work on open-domain long-form question answering has been a collaboration involving Kalpesh Krishna, Aurko Roy and Mohit Iyyer.
Announcing AWS Media Intelligence Solutions
       
Today, we’re pleased to announce the availability of AWS Media Intelligence (AWS MI) solutions, a combination of services that empower you to easily integrate AI into your media content workflows. AWS MI allows you to analyze your media, improve content engagement rates, reduce operational costs, and increase the lifetime value of media content. With AWS MI, you can choose turnkey solutions from participating AWS Partners or use AWS Solutions to enable rapid prototyping. TripleLift is an AWS Technology Partner that provides a programmatic advertising platform powered by AWS MI. Customers can dramatically reduce the time and cost requirements to produce, distribute and monetize media content at scale with AWS Media Intelligence and its underlying AI services.
AWS and Hugging Face Collaborate to Simplify and Accelerate Adoption of Natural Language Processing Models
       
Thanks to its managed infrastructure and its advanced machine learning capabilities, customers can build and run their machine learning workloads quicker than ever at any scale. As NLP adoption grows, so does the adoption of Hugging Face models, and customers have asked us for a simpler way to train and optimize them on AWS. Working with Hugging Face Models on Amazon SageMakerToday, we’re happy to announce that you can now work with Hugging Face models on Amazon SageMaker. In our business, we use machine learning models to help customers contextualize conversations, remove time-consuming tasks, and deflect repetitive questions. Getting StartedYou can start using Hugging Face models on Amazon SageMaker today, in all AWS Regions where SageMaker is available.
Pytorch Training Tricks and Tips
       
In this article, I will describe and show the code for four different PyTorch training tricks that I have personally found to improve the training of my deep learning models. Converting all calculations to 16-bit precision in PyTorch is very simple to do and only requires a few lines of code. The most direct way to fix an out-of-memory error is to reduce your batch size, but suppose that you don't want to reduce it. If you don't want to reduce your batch size, you can use gradient accumulation to simulate your desired batch size. Suppose that your machine/model can only support a batch size of 16, increasing it results in a CUDA out-of-memory error, and you want an effective batch size of 32.
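A sketch combining two of those tricks, 16-bit mixed precision and gradient accumulation (model, criterion, optimizer, and loader are placeholders; the accumulation factor of 2 matches the batch-16-to-32 example above):

import torch
from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler()
accumulation_steps = 2  # e.g. physical batch 16 -> effective batch 32

for step, (inputs, targets) in enumerate(loader):
    with autocast():  # run the forward pass in 16-bit precision
        loss = criterion(model(inputs), targets) / accumulation_steps
    scaler.scale(loss).backward()  # accumulate scaled gradients
    if (step + 1) % accumulation_steps == 0:
        scaler.step(optimizer)  # unscale gradients and apply the update
        scaler.update()
        optimizer.zero_grad()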
Medical Cost Prediction
       
A health insurance company can only make money if it collects more than it spends on the medical care of its beneficiaries.

colSums(is.na(train))
#> age sex bmi children smoker region charges
#>   0   0   0        0      0      0       0
colSums(is.na(test))
#> age sex bmi children smoker region charges
#>   0   0   0        0      0      0       0

Awesome!

formula <- as.formula(paste(' ~ .^2 + ', paste('poly(', colnames(X_train), ', 2, raw=TRUE)[, 2]', collapse = ' + ')))
formula
#> ~.^2 + poly(age, 2, raw = TRUE)[, 2] + poly(bmi, 2, raw = TRUE)[, 2] +
#>   poly(children, 2, raw = TRUE)[, 2] + poly(smoker, 2, raw = TRUE)[, 2]

Then, insert y_train and y_test back into the new datasets. age x bmi, age x children, age x smoker, bmi x children, bmi x smoker, and children x smoker are the six interactions between pairs of the four features.

summary(lm_all)
#> Call:
#> lm(formula = charges ~ age + bmi + children + smoker, data = train)
#>
#> Residuals:
#>    Min     1Q Median     3Q    Max
#> -11734  -2983  -1004   1356  29708
#>
#> Coefficients:
#> Estimate Std.
How to Fine-Tune BERT Transformer with spaCy 3
       
Since the seminal paper "Attention is all you need" by Vaswani et al., Transformer models have become by far the state of the art in NLP technology. In this tutorial, I will show you how to fine-tune a BERT model to predict entities such as skills, diploma, diploma major, and experience in software job descriptions. Below is a step-by-step guide on how to fine-tune the BERT model with spaCy 3. Data labeling: to fine-tune BERT using spaCy 3, we need to provide training and dev data in the spaCy 3 JSON format (see here), which will then be converted to a .spacy binary file. We were able to extract most of the skills, diploma, diploma major, and experience correctly.
Scikit-Learn Cheat Sheet (2021), Python for Data Science
       
Scikit-learn is a free software machine learning library for the Python programming language. It features various classification, regression, clustering algorithms, and efficient tools for data mining and data analysis. It’s built on NumPy, SciPy, and Matplotlib. Basic Example:The code below demonstrates the basic steps of using scikit-learn to create and run a model on a set of data. The steps in the code include: loading the data, splitting into train and test sets, scaling the sets, creating the model, fitting the model on the data, using the trained model to make predictions on the test set, and finally evaluating the performance of the model.
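Those steps translate almost one-to-one into code; a compact sketch using the built-in iris data:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)                      # load the data
X_train, X_test, y_train, y_test = train_test_split(   # split into train/test
    X, y, test_size=0.3, random_state=0)
scaler = StandardScaler().fit(X_train)                 # scale the sets
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)
model = LogisticRegression().fit(X_train, y_train)     # create and fit the model
y_pred = model.predict(X_test)                         # predict on the test set
print(accuracy_score(y_test, y_pred))                  # evaluate performance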
Framework for a successful Continuous Training Strategy
       
Continuous training seeks to automatically and continuously retrain the model to adapt to changes that might occur in the data. Yet, regardless of the use case, three main questions need to be addressed when designing a continuous training strategy. 1 — When should the model be retrained? The three most common strategies are periodic retraining, performance-based retraining, or retraining based on data changes. For future windows, a different size may be selected, according to the comparison with the test set. The disadvantage is that it requires a more complex training pipeline (see the next question, 'What to train?') to test the different window sizes and select the optimal one, and it is much more compute-intensive.
How I passed the AWS Certified Machine Learning Specialty
       
Why did I start a machine learning education? This article is specifically about the AWS Machine Learning Specialty, the study path I took, and what the exam involves. The AWS Machine Learning Specialty was one part of my platform-specific machine learning goals. In Amazon's words, the AWS Certified Machine Learning — Specialty certification "validates a candidate's ability to design, implement, deploy, and maintain machine learning (ML) solutions for given business problems." To cover all that, the exam is split into four domains. Summary: I enjoyed the journey to the AWS Machine Learning Specialty.
A practical guide to TFRecords
       
Images are a common domain in deep learning, with MNIST [1] and ImageNet [2] being two well-known datasets. To make loading and parsing image data efficient, we can resort to TFRecords as the underlying file format. For each image and corresponding label, we then use the function above to create such an object. The reverse is also possible: earlier, we defined a dictionary that we used to write our content to disk. In the last step, we have to parse our image back from its serialized form into the (height, width, channels) layout.
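A sketch of that parsing step (it assumes each record stores the raw image bytes plus height/width/depth fields, mirroring a writing scheme like the one above; the names are illustrative):

import tensorflow as tf

feature_description = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "label": tf.io.FixedLenFeature([], tf.int64),
    "height": tf.io.FixedLenFeature([], tf.int64),
    "width": tf.io.FixedLenFeature([], tf.int64),
    "depth": tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    ex = tf.io.parse_single_example(serialized, feature_description)
    image = tf.io.decode_raw(ex["image"], tf.uint8)  # bytes -> flat uint8 tensor
    image = tf.reshape(image, (ex["height"], ex["width"], ex["depth"]))
    return image, ex["label"]

dataset = tf.data.TFRecordDataset("train.tfrecord").map(parse_example)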
Boost basic Dataset and simple CNN to answer real environment problem
       
We modified our dataset and used the Plant Village dataset from Kaggle, which is similar to our first dataset (leaf images on a uniform background), but without any data augmentation. For instance, the same tomato-plant background is used for all tomato classes. Summary of iteration 2: dataset "Plant Village" with data augmentation, adding new classes from the "Image-Net" dataset. This should help the model not to focus on the background of the image. Summary of iteration 2-bis: dataset "Plant Village" with data augmentation, adding new classes from the "Image-Net" dataset.
Keywords to know before you start reading papers on GANs
       
Over the past few weeks, I have probably read a dozen papers on GANs (and their variants) and tinkered with their code on custom images (courtesy of open-source GitHub repos). While most of these papers are brilliantly written, I wish there were a few keywords I had known before plunging into these academically written manuscripts. Below I discuss a few of them and hope it saves you some time (and frustration) when you encounter them in papers. As for prerequisites, I am assuming most of you already know what Discriminator and Generator networks are with regard to GANs. For those of you who might need a recap: a Generator network's aim is to produce fake images that look real.
Optimise Deep Learning Workflow with Fast S3
       
Building a reproducible and scalable deep learning system with fast S3 as the central data and model repository. I know most data scientists do not care about storage, and they shouldn't. However, having a fast S3 object storage in the system would definitely help optimise our deep learning workflow. Prior to machine learning and deep learning, I spent 10+ years on big data (Hadoop & Spark) and DevOps (cloud, platform-as-a-service). How does a fast S3 object storage help optimise our DL workflow? Since FlashBlade S3 is very fast, it is feasible to read S3 data directly into the training iteration. By using a fast S3 like FlashBlade S3, and tuning the number of parallel reads and the buffer size, it is possible to reach performance comparable to reading from fast NFS.
Pretrained Transformers as Universal Computation Engines
       
Pretrained Transformers as Universal Computation EnginesTransformers have been successfully applied to a wide variety of modalities: natural language, vision, protein modeling, music, robotics, and more. This enables the models to utilize generalizable high-level embeddings trained on a large dataset to avoid overfitting to a small task-relevant dataset. To illustrate this, we take a pretrained transformer language model and finetune it on various classification tasks: numerical computation, vision, and protein fold prediction. We refer to this as “Frozen Pretrained Transformer”. Furthermore, we find the language-pretrained frozen transformers converge faster than the randomly initialized frozen transformers, typically by a factor of 1-4x, indicating that language might be a good starting point for other tasks.
Create forecasting systems faster with automated workflows and notifications in Amazon Forecast
       
Forecast enables notifications by onboarding to Amazon EventBridge, which lets you activate these notifications either directly through the Forecast console or through APIs. To create your rules for notifications through EventBridge, complete the following steps: on the Forecast console, choose your dataset. For this post, we choose "Forecast Dataset Import Job State Change" because we're interested in knowing when the dataset import is complete. Ranjith Kumar Bodla is an SDE on the Amazon Forecast team. Shannon Killingsworth is a UX Designer for Amazon Forecast and Amazon Personalize.
The Evolution of Facial Recognition — A Case Study in the Transformation of Deep Learning
       
Machine learning has often been described as the study of "algorithms that create algorithms". In reality, humans have a heavy role in ensuring that machine learning algorithms work with the given data. Although the No Free Lunch theorem makes a theoretical case for the impossibility of a "universal learner", deep learning is a huge step in that direction. Neural networks are — if you will — the "algorithms that create machine learning algorithms." Deep learning is differentiated from machine learning by the massive parametrization of its models. As such, deep learning models are not only more powerful than classical machine learning models, but also much more generalizable across different contexts.
Logistic Regression in real-life: building a daily productivity classification model
       
[Figure: scatter plot of the training target values (circles) and the values predicted by the Linear Regression model (triangles).] But even with seemingly encouraging results, a Linear Regression model has a few limitations when it comes to classification tasks: it implies that the outcome values have a specific order. For a model with only one feature, or predictor, the link function g can be described as the logit, g(p) = log(p / (1 - p)). Mathematically speaking, what you achieve by solving for p is the inverse of the logit function, a function that's called the logistic function.
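That inverse, the logistic function, is one line of code (a quick sketch using NumPy):

import numpy as np

def logistic(z):
    # inverse of the logit link: maps any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

print(logistic(0.0))  # 0.5, the decision boundary
print(logistic(4.0))  # ~0.982, a confident positive prediction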
How To Deploy Machine Learning Models
       
How To Deploy Machine Learning ModelsImage Created by AuthorJupyter notebooks are where machine learning models go to die. In general, companies don’t care about state-of-the-art models, they care about machine learning models that actually create value for their customers. For machine learning code you should also describe and/or link to experiments that were run so people can view the process of creating your models. If things have gone well you have a front-end web app running on your machine that allows you to access your machine learning model predictions. How to Learn MoreI hope this overview on how to deploy machine learning models helped you understand the basic steps to deploying your models.
Cervical Cancer Prediction and Boruta analysis (R)
       
Then we apply the two functions to the columns, and we can establish an attribute that represents cervical cancer later on. A positive result does not mean that the patient suffers from cervical cancer, but the likelihood increases the more positives a patient receives. Conclusion: to sum up, cervical cancer is one of the most life-threatening diseases out there, responsible for thousands of deaths per year. Cervical cancer screening for individuals at average risk: 2020 guideline update from the American Cancer Society. Retrieved February 22, 2021, from https://medium.com/opex-analytics/why-you-need-to-understand-the-trade-off-between-precision-and-recall-525a33919942. Dataset: UCI Machine Learning Repository: Cervical cancer (Risk Factors) Data Set.
Perform K-Means Clustering in R
       
One of the common questions regarding the K-means algorithm is whether it can handle non-numeric data.

data("iris")
?iris

Edgar Anderson's Iris Data: this famous (Fisher's or Anderson's) iris data set gives the measurements, in centimeters, of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. Before performing the K-means algorithm, I first checked the labels to see how many clusters were present in this dataset. Then, I fitted a K-means model with k = 3 and plotted the clusters with the "fpc" package. In this blog, I've discussed fitting a K-means model in R, finding the best K, and evaluating the model.
Normal Equation in Python: The Closed-Form Solution for Linear Regression
       
It works only for Linear Regression and not for any other algorithm. The Normal Equation is the closed-form solution for the Linear Regression algorithm, which means that we can obtain the optimal parameters with a formula that involves just a few matrix multiplications and inversions. To calculate theta, we take the partial derivative of the MSE loss function (equation 2) with respect to theta and set it equal to zero. This yields the Normal Equation: theta = (X^T X)^(-1) X^T y (source: Andrew Ng). If you know about matrix derivatives, along with a few properties of matrices, you should be able to derive the Normal Equation yourself. The algorithm: calculate theta using the Normal Equation.
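In NumPy the whole solver is a couple of lines (a sketch with synthetic data; np.linalg.pinv is a safer choice when X^T X is near-singular):

import numpy as np

X = np.random.rand(100, 3)                      # 100 samples, 3 features
y = X @ np.array([2.0, -1.0, 0.5]) + 4.0        # synthetic targets, intercept 4

X_b = np.c_[np.ones((X.shape[0], 1)), X]        # prepend a bias column of ones
theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y  # the Normal Equation
print(theta)                                    # ~[4.0, 2.0, -1.0, 0.5]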
Xgboost Regression Training on CPU and GPU in Python
       
Now that we have software that can use the GPU, let's train some models! For the modeling task, I will load data containing 1,017,209 rows and the following columns:

Store — store identification number
Date — date of the store sales record
DayOfWeek — weekday of the Date
Sales — revenue from goods sold during that Date (only available in train_data.csv)
ShopOpen — boolean flag indicating whether a shop was open during that Date (if not open, Sales should be 0)
Promotion — boolean flag indicating whether any promotions ran during that Date
StateHoliday — factor variable indicating whether the Date is a state holiday
SchoolHoliday — factor variable indicating whether the Date is a school holiday
StoreType — factor variable describing the type of a store
AssortmentType — factor variable describing the assortment type of a store

The task is to model the Sales (Y) variable using all the other features; note that all the other features are categorical. After adding two additional features for the day of the month and the month of the year, we can inspect how many unique categorical values there are in the dataset. The final X matrix has 1150 features and more than a million rows: an ideal real-life dataset to test out computation speeds!
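Switching the training between CPU and GPU is a one-parameter change in the scikit-learn API (a sketch; the parameter values are illustrative, and gpu_hist requires a CUDA-enabled build of xgboost):

from xgboost import XGBRegressor

# CPU histogram-based training
model_cpu = XGBRegressor(tree_method="hist", n_estimators=500)

# identical model, trained on the GPU instead
model_gpu = XGBRegressor(tree_method="gpu_hist", n_estimators=500)

# model_cpu.fit(X, y); model_gpu.fit(X, y)  # time both to compare speeds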
Ethics in Data Science or: How I Learned to Start Worrying and Question the Process
       
So, you've created a model. Is your model good, or is it doing good? As part of my work in the immersive data science program at Metis, I completed a short side presentation on concerns about ethics in technology, particularly in relation to our work in data science. I focused on the two books above, Weapons of Math Destruction by Cathy O'Neil and Algorithms of Oppression by Safiya Umoja Noble. Is your model good or is it doing good?
Implementing Transfer Learning from RGB to Multi-channel Imagery
       
In this article, we shall explore two distinct concepts implemented within the semantic segmentation part of the project: transfer learning, and multi-channel input. What is transfer learning? Transfer learning is a machine learning technique for re-using a pre-trained model on a new problem. Given the small number of images, transfer learning seemed like a good path to explore. Typically with transfer learning, we exclude the final layer and replace it with layers more specific to the new task. This should be set to true if we're making inference with the pre-trained model, as opposed to implementing transfer learning.
Tomorrow’s car silicon brain, how is it made?
       
The bandwidth and power consumption of such external memory are a bottleneck to the high system performance required. Thus, more computation units and higher peak performance can be achieved with a smaller computation-unit design. [3] describes three main techniques to improve performance by optimizing computation-unit designs. Low bit-width computation units: the bit-width of the input arrays directly impacts the size of the computation units. CAVBench [8] is currently a good starting point for evaluating the performance of autonomous-driving computing systems.
Object Detection Explained: R-CNN
       
Object detection consists of two separate tasks: classification and localization. The key concept behind the R-CNN series is region proposals. In the following blogs, I decided to write about the different approaches and architectures used in object detection. Extracting region proposals: Selective Search is a region-proposal algorithm used for object localization that groups regions together based on their pixel intensities. Paper: "Rich feature hierarchies for accurate object detection and semantic segmentation".
Compute cost and environmental impact of Deep Learning
       
The compute used to train state-of-the-art deep learning models continues to grow exponentially, exceeding the rate of Moore's Law by a wide margin. We estimate the cloud compute cost and carbon emissions of the full R&D required for a new state-of-the-art DL model. NAS can approximate the compute cost of the full research-and-development cycle needed to find a new state-of-the-art model. The CO2 emissions and monetary cost of training some well-known deep learning models are shown in the accompanying figure (image from [2]). Financial analysts such as ARK Invest predict the deep learning market cap to grow from $2 trillion in 2020 to $30 trillion in 2037 [3].
A Gentle Introduction to XGBoost Loss Functions
       
Tutorial overview: this tutorial is divided into three parts: XGBoost and loss functions; XGBoost loss for classification; and XGBoost loss for regression. Extreme Gradient Boosting, or XGBoost for short, is an efficient open-source implementation of the gradient boosting algorithm. XGBoost can be installed as a standalone library, and an XGBoost model can be developed using the scikit-learn API.

# check xgboost version
import xgboost
print(xgboost.__version__)

You can see a full list here. Next, let's take a look at XGBoost loss functions for regression. The XGBoost objective function used when predicting numerical values is the "reg:squarederror" loss function.
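Setting the loss is just the objective argument (a sketch; both objective strings shown are standard XGBoost objectives):

from xgboost import XGBClassifier, XGBRegressor

# squared-error loss for regression
reg = XGBRegressor(objective="reg:squarederror")

# logistic loss for binary classification
clf = XGBClassifier(objective="binary:logistic")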
The Most In-Demand Skills for Data Scientists in 2021
       
An in-depth analysis of the most in-demand skills from webscraping over 15,000 Data Scientist job postings. IntroductionI just wanted to start off by saying that this is heavily inspired by Jeff Hale’s articles that he wrote back in 2018/2019. I’m writing this simply because I wanted to get a more up-to-date analysis of what skills are in demand today, and I’m sharing this because I’m assuming that there are people out there that also want to see an updated version of the most in-demand skills for data scientists in 2021. Take what you want from this analysis — it’s obvious that the insights gathered from webscraping job postings are not a perfect correlation to what data science skills are actually most-demanded. However, I think this gives a good indication of what general skills you should focus more on, and likewise, stray away from.
MLOps for Research Teams
       
To make MLOps more concrete, we’ll look at what problems it solves for research teams. MLOps solves these problems and allows research teams to achieve their goals despite the complexity that comes from dealing with large datasets, code, and machine learning models. Most machine learning teams should have an architecture that includes the following:Our MLOps architecture consists of several integrated components that together address the difficulties most teams face. While some research teams still operate without MLOps tools or best practices, we believe MLOps has become an essential ingredient for nearly all teams. We love finding the right MLOps architecture for machine learning research teams.
How to Avoid Burnout as an Ambitious New Data Scientist
       
When you’re done work, you’re done work. Therefore, it’s imperative that when you’ve completed all of your tasks for the day, you call it a day. To ensure that I don’t get sucked back into work, I’ll take time to work out, work on an article for Medium, clean, or get dinner started. As a new data scientist, it’s important to have some healthy habits in place so that when work gets crazy, you have some constants that will help you avoid burnout and will keep you healthy and productive. A common complaint of people suffering burnout is that they feel like they are stuck in a rut doing the same thing day in day out.
Three ways to run Linear Mixed Effects Models in Python Jupyter Notebooks
       
Accessing LMER in R using rpy2 and %Rmagic: the second option is to directly access the original LMER packages in R through the rpy2 interface, which allows users to toss data and results back and forth between your Python Jupyter Notebook environment and your R environment. The next set of lines installs rpy2, then uses rpy2 to install the lme4 and lmerTest packages. Next, you’ll need to activate the R magic in your Jupyter Notebook cell by running:

%load_ext rpy2.ipython

After this, any Jupyter Notebook cell starting with %%R will allow you to run R commands from your notebook.
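A minimal sketch of what such a cell might look like once the magic is active, assuming a pandas DataFrame df with columns y, x, and group (the -i flag passes it into R; the model formula is illustrative):

%%R -i df
library(lmerTest)
# random intercept per group, fixed effect for x
model <- lmer(y ~ x + (1 | group), data = df)
summary(model)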
Building a data dashboard for housing prices using Plotly-Dash in Python
       
For this project, I’ll be using Plotly-Dash, a Python library for creating analytical data web apps. Lastly, I called the Geonames API to download latitude, longitude, and population data for each city. Note that you’ll need to register on Geonames.org to use their Python API, and the key is your username. I ran some quick statistics on the success rate of the API for pulling population data and discovered that it successfully downloaded population data for ~94% of the cities. Dash uses dictionaries and lists extensively in its keyword arguments, so getting familiar with these Python concepts is definitely helpful when building a Dash app.
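Since Dash leans so heavily on nested dicts and lists, a minimal app skeleton illustrates the pattern; the layout, data, and figure here are placeholders rather than the author’s actual dashboard (the dcc/html import style assumes Dash 2.x):

# minimal Dash app skeleton
import dash
from dash import dcc, html
import pandas as pd
import plotly.express as px

df = pd.DataFrame({"city": ["A", "B"], "price": [300000, 450000]})
fig = px.bar(df, x="city", y="price")

app = dash.Dash(__name__)
app.layout = html.Div([
    html.H1("Housing prices"),
    dcc.Graph(figure=fig),   # components take dicts/lists as keyword arguments
])

if __name__ == "__main__":
    app.run_server(debug=True)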
Building a full-stack spam catching app — 3. Frontend & Deployment
       
In the last post, we built out the backend for our app by creating the spam classifier and a small Flask app to serve the model. At this endpoint, we simply call render_template on our HTML file. Finally, we have the event listeners, which are what actually connect these JavaScript functions to our HTML page. Knowing a little bit of HTML, CSS, and JavaScript goes a long way in giving life to a data science project!
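For readers who have not seen the earlier posts, a minimal sketch of the kind of endpoint described (the file name index.html is a placeholder):

# minimal Flask endpoint that serves an HTML page
from flask import Flask, render_template

app = Flask(__name__)

@app.route("/")
def home():
    # render_template looks for templates/index.html by default
    return render_template("index.html")

if __name__ == "__main__":
    app.run(debug=True)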
Data Augmentation for Brain-Computer Interface
       
Brain-computer interfaces have always faced severe data-related issues such as a lack of sufficient data, lengthy calibration times, and data corruption. In this article, I’ll explain the issue of creating enough training data in the context of non-invasive BCIs and present a non-exhaustive list of data augmentation techniques for EEG datasets. BCI & Data Acquisition: brain-computer interface (BCI) systems are designed to connect the brain and external devices for several use cases. Two approaches exist to generate augmented data. Data augmentation helps increase the available training data and facilitates the use of more complex DL models.
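Among the simplest techniques in this family is adding noise to the raw signal. A minimal sketch, where the epoch shape and noise level are illustrative assumptions:

import numpy as np

def augment_with_noise(eeg, sigma=0.01):
    # return a noisy copy of an EEG epoch (channels x time samples)
    return eeg + np.random.normal(0.0, sigma, eeg.shape)

epoch = np.random.randn(32, 512)    # placeholder: 32 channels, 512 samples
augmented = augment_with_noise(epoch)
print(augmented.shape)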
Chip Huyen on Her Career, Writing, and Machine Learning
       
You share a lot about machine learning in production, such as through your writing and tweets. Maybe people in data and machine learning are unaware of those solutions, or too lazy to learn how to use them. I wonder how we can help smaller companies also benefit from machine learning. What’s your advice for small to medium size companies starting to work on deploying their first machine learning models? The class caught the attention of various machine learning teams and led to her role at NVIDIA.
Why you should monitor your pictures’ sharpness when deploying Computer Vision models
       
As you probably know, pictures are encoded into n-dimensional arrays (1 layer for grayscale pictures and 3 for RGB ones). If you are not at ease with this concept, I recommend this article to you. When it comes to tabular datasets, for example, we can monitor the statistical characteristics of each feature (min, max, mean, standard deviation, etc.), so for pictures we need to find a way to calculate the variations within the picture, from one pixel to another. Gradient calculation of the [1, 3, 0] vector (image by author): from 1 to 3, the function is “y = 2x”, its derivative being “2”. The workshop team can monitor the camera sharpness on their control screen, so they decided to clean the camera at some point during this period.
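A minimal sketch of turning this gradient idea into a single sharpness score; using np.gradient and the mean squared gradient magnitude is an illustrative choice, not necessarily the author’s exact metric:

import numpy as np

def sharpness_score(gray_image):
    # mean squared magnitude of the pixel-to-pixel gradients
    gy, gx = np.gradient(gray_image.astype(float))
    return float(np.mean(gx ** 2 + gy ** 2))

sharp = np.array([[0, 255], [0, 255]])   # hard vertical edge
flat = np.zeros((2, 2))                  # no variation at all
print(sharpness_score(sharp), sharpness_score(flat))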
Image Feature Extraction Using PyTorch
       
When we want to cluster data like an image, we have to change its representation into a one-dimensional vector. The CNN model is mostly used for image data; therefore, this type of neural network is well suited to processing images, especially for feature extraction [1][2]. K-Means algorithm: after we extract the feature vector using a CNN, we can use it based on our purpose. At first, K-Means will initialize several points called centroids.
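A minimal sketch of the extract-then-cluster pipeline, using a pretrained ResNet-18 with its classification head removed (the model choice and random batch are illustrative assumptions):

import torch
import torchvision.models as models
from sklearn.cluster import KMeans

resnet = models.resnet18(pretrained=True)
extractor = torch.nn.Sequential(*list(resnet.children())[:-1])  # drop the fc layer
extractor.eval()

images = torch.randn(16, 3, 224, 224)   # placeholder image batch
with torch.no_grad():
    feats = extractor(images).flatten(1).numpy()   # one 512-d vector per image

labels = KMeans(n_clusters=4, random_state=0).fit_predict(feats)
print(labels)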
Enter the j(r)VAE: divide, (rotate), and order… the cards
       
An introduction to joint (rotationally-invariant) VAEs that can perform unsupervised classification and disentangle relevant (continuous) factors of variation at the same time. In this case, each encoded object corresponds to a single latent vector, and we can simply cluster the points in the latent space. Rather than working with the standard MNIST data set, we are going to make our own data set of playing card suits, with monochrome clubs, spades, diamonds, and hearts. From left to right: (a = 12, s = 1), (a = 12, s = 10), (a = 120, s = 1), and (a = 120, s = 10).
Advanced YoloV5 tutorial — Enhancing YoloV5 with Weighted Boxes Fusion
       
Photo by Eric Karim Cornelis on Unsplash. There are tons of YoloV5 tutorials out there; the aim of this article is not to duplicate that content but to extend it. Most of the popular object detection models, like YoloV5 and EfficientDet, use a command-line interface to train and evaluate rather than a coding approach. Weighted Boxes Fusion is a method to dynamically fuse the boxes either before training (which cleans up the data set) or after training (making the predictions more accurate). You can also try to use it after predicting the bounding boxes with YoloV5 in the same way. Those are most of the aspects that you can easily control and use to boost your performance with YoloV5.
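A minimal sketch of fusing two sets of predictions with the ensemble-boxes package (coordinates must be normalized to [0, 1]; the boxes, scores, and thresholds here are placeholders):

from ensemble_boxes import weighted_boxes_fusion

boxes_list = [[[0.10, 0.10, 0.50, 0.50]], [[0.12, 0.11, 0.52, 0.49]]]
scores_list = [[0.9], [0.8]]
labels_list = [[0], [0]]

boxes, scores, labels = weighted_boxes_fusion(
    boxes_list, scores_list, labels_list,
    iou_thr=0.55,       # boxes overlapping more than this get fused
    skip_box_thr=0.0,   # ignore boxes below this score
)
print(boxes, scores, labels)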
Making Your First Kaggle Submission
       
data = pd.concat(objs=[df, df_test], axis=0).reset_index(drop=True)

Our target variable will be the Survived column — let’s keep it aside:

target = ['Survived']

First we check for null values in the columns of the training data with df.isnull().sum(). Right away we can observe that three columns seem quite unnecessary for modelling:

data = data.drop(['PassengerId', 'Ticket', 'Cabin'], axis=1)

Now we move on to the other columns that have null values:

data.Age.fillna(data.Age.median(), inplace=True)
data.Fare.fillna(data.Fare.median(), inplace=True)
data.Embarked.fillna(data.Embarked.mode()[0], inplace=True)

Doing that, we now have no null values in our data.
Deeper Neural Networks Lead to Simpler Embeddings
       
Recent research increasingly investigates how neural networks, as heavily parameterized as they are, generalize. Perhaps one of the most intriguing proposals is that deeper neural networks lead to simpler embeddings. This makes neural networks more likely — by chance — to find simple solutions rather than complex ones. Huh et al. begin by analyzing the rank of linear networks, that is, networks without any nonlinearities such as activation functions. This paper’s fascinating contribution argues instead that simpler solutions are in fact better, and that more successful, highly parameterized neural networks arrive at those simpler solutions because of, not despite, their parametrization.
Sequence Dreaming with Depth Estimation in PyTorch
       
While Big Sleep is still the big hype on Reddit, I decided to take another look at open questions in the context of deep dreaming on consecutive frames. Inspired by preceding work, such as this Caffe implementation, I wanted to include more recent insights on single-class dreaming (see my previous post) and depth estimation, in addition to integrating everything into an up-to-date PyTorch framework. Sequence dreaming with the tricycle class. I approach this problem by warping the previous dream pattern onto the next frame and parametrizing the strength of the update with the flow vector field. In each step, the vector field is computed with the Farneback method (provided by OpenCV) or alternatively the Spatial Pyramid Network (SPyNet).
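A minimal sketch of computing that dense flow field with OpenCV’s Farneback implementation (the random frames and parameter values are placeholders):

import cv2
import numpy as np

prev_frame = np.random.randint(0, 255, (240, 320), dtype=np.uint8)
next_frame = np.random.randint(0, 255, (240, 320), dtype=np.uint8)

# returns an (H, W, 2) array of per-pixel (dx, dy) displacements
flow = cv2.calcOpticalFlowFarneback(
    prev_frame, next_frame, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0,
)
print(flow.shape)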
Speed-up your Pandas Workflow by changing a single line of code
       
Photo by Tim Gouw on Unsplash. Pandas is one of the most popular Python libraries used for data exploration and visualization, but it does not take advantage of all the available CPU cores to scale up its computations. In this article, you can read how to scale up the performance of Pandas computations using Modin, just by changing one line of code. Unlike other distributed libraries, Modin is easily integrated and compatible with the Pandas library, with similar APIs. (Source: CPU core utilization in Pandas and Modin.) For a large data science workstation or a cluster with many CPU cores, Modin’s performance scales up dramatically, as it fully utilizes the available cores.
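The one-line change in question is the import itself. A minimal sketch (the CSV path is a placeholder, and Modin needs a backend such as Ray or Dask installed):

# before: import pandas as pd
import modin.pandas as pd

df = pd.read_csv("large_file.csv")   # same pandas API, parallel execution
print(df.describe())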
4 Easy Steps for Implementing CatBoost
       
Now is the time to learn this powerful library, and below is how you can implement it in four easy steps. Here are the main installation commands:

!pip install catboost
!pip install ipywidgets
!jupyter nbextension enable --py widgetsnbextension

Here are the main import commands:

from catboost import CatBoostRegressor
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd

As you can see, there are only a few lines of code needed for installing and importing. Here are the main dataset-defining commands:

dataframe = pd.read_csv('file_path_to_your_dataset.csv')
X = dataframe[['X_feature_1', 'X_feature_2', ...]]
y = dataframe['target_variable']
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75, random_state=42)

Apply Model: when applying the CatBoost model, it works similarly to other sklearn approaches. However, the most important part is to designate your categorical variables so that you can get the most out of your CatBoost model. References: [1] Photo by Manja Vitolic on Unsplash, (2018); [2] Yandex, CatBoost, (2021); [3] Photo by Christopher Gower on Unsplash, (2017); [4] Photo by krakenimages on Unsplash, (2020)
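A minimal sketch of that last step, designating categorical columns at fit time; the column indices are placeholders, and the snippet assumes the X_train/X_test split above:

from catboost import CatBoostRegressor

model = CatBoostRegressor(iterations=200, verbose=False)
# cat_features marks which columns CatBoost should treat as categorical
model.fit(X_train, y_train, cat_features=[0, 1], eval_set=(X_test, y_test))
preds = model.predict(X_test)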
Two outlier detection techniques you should know in 2021
       
Photo by Alexander Andrews on Unsplash. An outlier is an unusual data point that differs significantly from other data points. Elliptic Envelope and IQR are commonly used outlier detection techniques. The intuition behind the Elliptic Envelope (image by author): the Elliptic Envelope method considers all observations as a whole, not individual features. IQR-based detection is a statistical approach. The outlier indices of each feature are very useful.
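A minimal sketch of both techniques; the data, contamination rate, and 1.5 multiplier are conventional illustrative choices:

import numpy as np
from sklearn.covariance import EllipticEnvelope

X = np.random.randn(200, 2)

# Elliptic Envelope: fit a robust ellipse and flag points outside it
ee = EllipticEnvelope(contamination=0.05)
ee_labels = ee.fit_predict(X)            # -1 marks outliers

# IQR: flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR], per feature
q1, q3 = np.percentile(X[:, 0], [25, 75])
iqr = q3 - q1
outlier_idx = np.where((X[:, 0] < q1 - 1.5 * iqr) | (X[:, 0] > q3 + 1.5 * iqr))[0]
print(outlier_idx)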
What does the h-index tell us that we could not know without it?
       
What does the h-index tell us that we could not know without it? A more complex way is to calculate a researcher’s h-index (h), which is supposed to also take into account how their N citations are distributed across their papers. If a researcher’s h-index is h, then h is the greatest number for which the statement “he or she has h articles with at least h citations” holds true. Note that this bound is tight: if the N citations are distributed equally between the first √N papers, then the bound is reached and we have h = √N. The 1st and 2nd rows show the h-index as a function of the number of citations (N) in linear and log-log scales, respectively.
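The definition translates directly into code. A minimal sketch:

def h_index(citations):
    # greatest h such that h papers have at least h citations
    ranked = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(ranked, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

print(h_index([10, 8, 5, 4, 3]))   # -> 4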
Mixture Density Networks: Probabilistic Regression for Uncertainty Estimation
       
Types of Uncertainty: there are two major kinds of uncertainty — epistemic and aleatoric uncertainty (phew, that was quite a mouthful). We also have another learned parameter (a latent representation) that decides how to mix these Gaussian components. Mixture Density Networks are built from two components — a neural network and a mixture model. Weight regularization — applying L1 or L2 regularization to the weights of the neurons that compute the means, variances, and mixing components. Summary: we have seen how important uncertainty is to business decisions and explored one way of capturing it using Mixture Density Networks.
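A minimal sketch of an MDN head that emits the three sets of parameters just mentioned; the sizes and the backbone features are illustrative:

import torch
import torch.nn as nn

class MDNHead(nn.Module):
    def __init__(self, in_dim, k):
        super().__init__()
        self.pi = nn.Linear(in_dim, k)         # mixing coefficients
        self.mu = nn.Linear(in_dim, k)         # component means
        self.log_sigma = nn.Linear(in_dim, k)  # log std devs, kept positive via exp

    def forward(self, h):
        pi = torch.softmax(self.pi(h), dim=-1)
        return pi, self.mu(h), torch.exp(self.log_sigma(h))

h = torch.randn(8, 32)                  # features from any backbone network
pi, mu, sigma = MDNHead(32, k=3)(h)
print(pi.shape, mu.shape, sigma.shape)  # each is (8, 3)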
Comparing Keras and PyTorch on sentiment classification
       
Photo by Karolina Grabowska from Pexels. After part one, which covered an overview of Keras and PyTorch syntaxes, this is part two of our comparison of Keras and PyTorch! We will use the IMDB dataset, a popular toy dataset in machine learning, which consists of movie reviews from the IMDB website annotated with positive or negative sentiment. Both Keras and PyTorch have helper functions to download and load the IMDB dataset.

embedding_dim = 128
hidden_dim = 64

Let’s start with the implementation in Keras (credits to the official Keras documentation), and then implement the same in PyTorch. So, with samples.transpose(0, 1) we effectively permute the first and second dimensions to fit PyTorch’s data model.
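For instance, the Keras helper is a one-liner. A minimal sketch (num_words caps the vocabulary size and is an illustrative choice):

from tensorflow.keras.datasets import imdb

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=20000)
print(len(x_train), len(x_test))   # 25000 25000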
Deep learning-based cancer patient stratification
       
Patient Stratification Using Deep Learning: molecular patterns or latent factors can stratify patients based on prognosis, response to drugs, or any other clinical variable. Can we predict subtypes using latent factors? On the right side of Figure 3, we color-coded the 2D projection of latent factors based on the CMS status. Figure 3: predicting subtypes using latent factors obtained via deep learning is more accurate. As you can see, in many cancers, using latent factors pushes this accuracy metric to a higher level.
Reinforcement Learning For Mice
       
Reinforcement learning (RL) is a type of machine learning that has been receiving a lot of attention in the past few years. He makes different types of mazes and observes the mice while they explore them. The mouse is learning intelligent behavior in complex, dynamic environments. Value function: almost all reinforcement learning algorithms are based on estimating value functions — functions of states that estimate how good it is for the agent to be in a given state. Markov Decision Process: a Markov Decision Process (MDP) is a mathematical framework to describe an environment in reinforcement learning.
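A minimal sketch of estimating a value function by value iteration on a toy two-state MDP; all transition probabilities and rewards are made up for illustration:

import numpy as np

# Bellman backup: V(s) <- max_a sum_s' P[s,a,s'] * (R[s,a,s'] + gamma * V(s'))
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # transitions, shape (S, A, S')
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.ones_like(P)                        # reward of 1 for every transition
gamma, V = 0.9, np.zeros(2)

for _ in range(100):                       # repeated Bellman backups
    V = np.max(np.sum(P * (R + gamma * V), axis=2), axis=1)
print(V)                                   # converges to 1 / (1 - gamma) = 10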
Google AI Blog: Leveraging Machine Learning for Game Development
       
Chimera: we developed Chimera as a game prototype that would heavily lean on machine learning during its development process. For the game itself, we purposefully designed the rules to expand the possibility space, making it difficult to build a traditional hand-crafted AI to play the game. With each iteration, the quality of the training data improved, as did the agent’s ability to play the game. We found that a relatively simple neural network was sufficient to reach high-level performance against humans and traditional game AI. We hope this work will inspire more exploration of the possibilities of machine learning for game development.
RAPIDS and Amazon SageMaker: Scale up and scale out to tackle ML challenges
       
In this post, we combine the powers of NVIDIA RAPIDS and Amazon SageMaker to accelerate hyperparameter optimization (HPO). This RAPIDS with SageMaker HPO example is part of the amazon-sagemaker-examples GitHub repository, which is integrated into the SageMaker UX, making it very simple to launch. The key ingredients for cloud HPO are a dataset, a RAPIDS ML workflow containerized as a SageMaker estimator, and a SageMaker HPO tuner definition. SageMaker estimator: now that we have our dataset, we build a RAPIDS ML workflow and package it using the SageMaker Training API into an interface called an estimator. Search strategy: in terms of HPO search strategy, SageMaker offers Bayesian and random search.
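A minimal sketch of the estimator-plus-tuner pattern described here; the image URI, role, metric regex, and hyperparameter ranges are all placeholders:

from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter

estimator = Estimator(
    image_uri="<your-rapids-training-image>",
    role="<your-sagemaker-role>",
    instance_count=1,
    instance_type="ml.p3.2xlarge",
)

tuner = HyperparameterTuner(
    estimator,
    objective_metric_name="accuracy",
    metric_definitions=[{"Name": "accuracy", "Regex": "accuracy: ([0-9\\.]+)"}],
    hyperparameter_ranges={"max_depth": ContinuousParameter(5, 15)},
    strategy="Bayesian",     # or "Random"
    max_jobs=10,
    max_parallel_jobs=2,
)
tuner.fit({"training": "s3://<bucket>/dataset"})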
How to load and store MLFlow models in R on DataBricks
       
Databricks has become an important building block in cloud computing, especially now, after Google announced the launch of Databricks on Google Cloud. It is true that Databricks supports R, Python, and Scala code, but various weaknesses surface when working with MLflow and R, specifically when trying to register an ML model. This is a pain in the neck if we want to load MLflow models in our R notebooks, but there is a solution. The supported magic commands are: %python, %r, %scala, and %sql. Now, after this “trick”, the model has been correctly registered in the Model repository and is ready to be used.
Double Descent Behavior Exists in Semi-Supervised Learning — Part 1
       
Recently, in a graduate deep learning class, our project group decided to read an interesting paper about generalization: Reconciling modern machine learning practice and the bias-variance trade-off [1]. Then, we came up with a research question: “Can we empirically observe similar double descent behaviors when we train those models in a semi-supervised learning setting?” More interestingly, [1] shows the existence of double descent across a wide spectrum of models and datasets. Double descent risk curve for the RFF model on MNIST [1]. Conclusion: to wrap it up, this paper introduced the existence of the double descent risk curve, reconciling the U-shaped bias-variance trade-off.
How to Run 30 Machine Learning Models with a Few Lines of Code
       
MACHINE LEARNING: How to Run 30 Machine Learning Models with a Few Lines of Code. Image by Keira Burton. Although the scikit-learn library makes our lives easier by making it possible to run models with a few lines of code, it can also be time-consuming when you need to test multiple models. The lazypredict library runs 30 machine learning models in just a few seconds and gives us a grasp of how models will perform with our dataset.

import pyforest
import warnings
warnings.filterwarnings("ignore")
from sklearn import metrics
from sklearn.metrics import accuracy_score

Now, let’s import the dataset we will be using from Kaggle.

import lazypredict
from lazypredict.Supervised import LazyClassifier

Finally, let’s run the models and see how it goes.
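That final step looks like this in practice. A minimal sketch, assuming train/test splits already exist:

# run all of lazypredict's classifiers in one call
clf = LazyClassifier(verbose=0, ignore_warnings=True)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)
print(models)   # a leaderboard DataFrame, one row per model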
Machine Learning: The Great Stagnation
       
This blog post generated a lot of discussion on Hacker News — many people have reached out to me giving more examples of the stagnation and more examples of projects avoiding it. However, this risk-free approach is growing in popularity and has specifically permeated my field, machine learning. “Useful” machine learning research on all datasets has essentially been reduced to making Transformers faster, smaller, and able to scale to longer sequence lengths. There is still substantial innovation happening in machine learning, just not from data scientists or machine learning researchers. Here are the projects that I believe represent a glimmer of hope against the stagnation of machine learning.
Everything Product People Need to Know About Transformers (Part 3: BERT)
       
BERT Explained: Four months to the day after OpenAI introduced GPT, Google published BERT: Bidirectional Encoder Representations from Transformers. MLM entails passing BERT a sentence like “I sat [MASK] my chair” and requiring BERT to predict the masked word. Applications of BERT: understanding the BERT training tasks is essential for determining its applications. Reuben can do this by training BERT further on a set of sentence pairs that more directly fit this pattern. So, how can you know if an application is really feasible for a transformer model?
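A minimal sketch of the masked-word task using the Hugging Face pipeline API (the model choice mirrors the example sentence above):

from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill_mask("I sat [MASK] my chair."):
    print(pred["token_str"], round(pred["score"], 3))   # highest-probability fillers first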
Real Life Meta-Learning: Teaching and Learning to Learn
       
By learning the best ways we teach and learn (meta-teaching and meta-learning), our teaching and learning efforts will do more for everyone. Intuitively, meta-learning is about learning how to learn or, more specifically, learning how to learn more effectively. Teaching and Learning From An Optimization Lens: throughout this article, we’ve been discussing how we can improve our learning and teaching capabilities. However, the optimization objective above only defines how to learn to learn for a single task. We improve our teaching by observing which teaching methods lead to students learning to learn more effectively and which methods do not.
TensorFlow.js Blueprint App
       
The main idea is that someone who would like to code logic and build a model with TensorFlow.js should be able to copy-paste from my sample app easily. For ML model training to be effective, we need to normalize the pairs of (x, y) coordinates. You should shuffle the data before it is sent for training; this will improve the chances of accurate model training. It also helps to check that there is no overfitting during training: if the model performs well on validation data too, model training is running fine. Next, I went on to model training and explained why fitDataset is the recommended way to train a neural network in TensorFlow.js.
A Practical Guide to Implementing a Random Forest Classifier in Python
       
Data Exploration: in this post we will be utilizing a random forest to predict the cupping scores of coffees. Building the Random Forest: now that the data is prepped, we can begin to code up the random forest. Once we have parameter values from the randomized grid search, we can use them as a starting point for a grid search. A grid search works, in principle, similarly to the randomized grid search, as it searches through the parameter space we define. Instantiating a grid search is also similar to instantiating the randomized grid search.
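A minimal sketch of this coarse-to-fine pattern; the parameter values and synthetic data are placeholders for the post’s actual coffee dataset:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV

X_train, y_train = make_classification(n_samples=300, random_state=0)
rf = RandomForestClassifier(random_state=0)

# broad randomized search first...
rand = RandomizedSearchCV(
    rf,
    param_distributions={"n_estimators": [100, 200, 300],
                         "max_depth": [None, 5, 10, 20]},
    n_iter=5, cv=3, random_state=0,
).fit(X_train, y_train)

# ...then a focused grid around the best values found
grid = GridSearchCV(
    rf,
    param_grid={"n_estimators": [150, 200, 250], "max_depth": [8, 10, 12]},
    cv=3,
).fit(X_train, y_train)
print(rand.best_params_, grid.best_params_)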
Universal Adversarial Perturbations Could be a Threat to Autonomous Vehicles
       
How impactful could universal perturbations be on a subset of COCO that only includes the categories related to autonomous driving? Introduction: this paper presents the impact of universal perturbations on object detection in five categories related to autonomous vehicles: person, stop sign, car, truck, and traffic light. Evaluating on the full COCO set could rather hinder assessing the impact of universal perturbations in the street settings that autonomous vehicles would mostly encounter. Universal adversarial perturbations against object detection. Universal adversarial perturbations.
Running Keras using Cygwin. The Purpose
       
Therefore, the early steps I took were through the Cygwin installer and only by using it. Since this project is developed for environments that don’t necessarily have an installed Python, I wanted the entire work to be done from Cygwin. I therefore returned to the installation process and added Python in the Select Packages screen (clearly, Cygwin offers many versions). I was now confident that running Keras using Cygwin was two minutes away. The package that I decided to test was PyInstaller, which is installed using a regular pip.
Can Transformers Solve This 90-Year-Old Classic Computer Science Problem Better Than Human Algorithms?
       
Can Transformers Solve This 90-Year-Old Classic Computer Science Problem Better Than Human Algorithms? The Travelling Salesman Problem was formulated in 1930, and is a classical computer science problem for optimization. The Travelling Salesman Problem is a useful way to visualize and think about this general field of combinatorial optimization. Approaches to the Travelling Salesman Problem can thus be utilized in a wide array of applications. The simulated annealing algorithm is shown visually below on the Travelling Salesman problem.
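Since the article uses simulated annealing as the classical baseline, a minimal sketch of that algorithm on a random TSP instance may help; the cooling schedule and all parameters are illustrative:

import math, random

random.seed(0)
cities = [(random.random(), random.random()) for _ in range(20)]

def tour_length(tour):
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

tour = list(range(len(cities)))
cur = tour_length(tour)
temp = 1.0
while temp > 1e-3:
    i, j = sorted(random.sample(range(len(tour)), 2))
    cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]   # reverse a segment (2-opt)
    delta = tour_length(cand) - cur
    # always accept improvements; accept worse tours with probability exp(-delta/T)
    if delta < 0 or random.random() < math.exp(-delta / temp):
        tour, cur = cand, cur + delta
    temp *= 0.999

print(round(cur, 3))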
Audio Deep Learning Made Simple: Sound Classification, Step-by-Step
       
Training data with audio file paths and class IDs; scan the audio file directory when metadata isn’t available. Having the metadata file made things easy for us. This training data with audio file paths cannot be input directly into the model. The audio pre-processing will all be done dynamically at runtime, when we read and load the audio files, so we keep only the audio file names (or image file names) in our training data. Mel spectrograms capture the essential features of the audio and are often the most suitable way to input audio data into deep learning models.
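A minimal sketch of that transform step with torchaudio; the file name and parameter values are placeholders:

import torchaudio

waveform, sample_rate = torchaudio.load("example.wav")
mel = torchaudio.transforms.MelSpectrogram(sample_rate=sample_rate, n_mels=64)
spec = mel(waveform)          # shape: (channels, n_mels, time)
print(spec.shape)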
Complete tutorial on how to use Hydra in Machine Learning projects
       
Learn everything you need to know about how to use Hydra in your machine learning projects. How Hydra handles different runs: whenever a program is executed using python main.py, Hydra will create a new folder in the outputs directory with the naming scheme outputs/YYYY-mm-dd/HH-MM-SS. Using Hydra for ML projects: now that you know the basic workings of Hydra, we can focus on using it to develop a machine learning project. Usage: --cfg [OPTION], where valid OPTIONs are job (your config file), hydra (Hydra’s config), and all (job + hydra). This is useful for quick debugging when you want to check what is being passed to a function. Hope this helps you in using Hydra in your projects.
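A minimal sketch of the decorator pattern Hydra is built around; the config directory and keys are placeholders:

# main.py
import hydra
from omegaconf import DictConfig

@hydra.main(config_path="conf", config_name="config")
def main(cfg: DictConfig) -> None:
    # values come from conf/config.yaml and can be overridden on the CLI,
    # e.g. python main.py lr=0.01
    print(cfg.lr, cfg.batch_size)

if __name__ == "__main__":
    main()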
Building a Deep Learning Image Captioning Model on Azure
       
What’s going on in this image? In onboarding for my new role leveraging AI/ML, I set out to build a deep learning model on the cloud from scratch. To build the deep learning model, Jason Brownlee’s image captioning model served as a jumping-off point. I used Azure Databricks and Azure Machine Learning as the platforms for creating my deep learning model. Like many, I found Andrew Ng’s Deep Learning and Machine Learning courses immensely helpful in learning about the deep learning space.
Gradient Descent Optimization With Nadam From Scratch
       
In this tutorial, you will discover how to develop the gradient descent optimization with Nadam from scratch. Tutorial Overview: this tutorial is divided into three parts; they are: Gradient Descent, the Nadam Optimization Algorithm, and Gradient Descent With Nadam (covering a two-dimensional test problem, gradient descent optimization with Nadam, and a visualization of Nadam optimization). Gradient descent is an optimization algorithm. In the main section, we explore how to implement gradient descent with Nadam and apply it to the test problem. The key line computes the bias-corrected first moment:

# mhat blends the momentum term with the current gradient
mhat = (mu * m[i] / (1.0 - mu)) + ((1 - mu) * g[i] / (1.0 - mu))
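A minimal sketch of a full update loop on a simple bowl-shaped objective; the hyperparameters are illustrative, and the bias-correction terms follow one common formulation of Nadam rather than the tutorial’s exact code:

import numpy as np

def gradient(x):
    return 2 * x                     # gradient of f(x) = sum(x^2)

x = np.array([1.0, 1.5])
m = np.zeros_like(x)
v = np.zeros_like(x)
alpha, mu, nu, eps = 0.02, 0.975, 0.999, 1e-8

for t in range(1, 201):
    g = gradient(x)
    m = mu * m + (1 - mu) * g                 # first moment
    v = nu * v + (1 - nu) * g ** 2            # second moment
    mhat = (mu * m / (1 - mu ** (t + 1))) + ((1 - mu) * g / (1 - mu ** t))
    vhat = v / (1 - nu ** t)
    x = x - alpha * mhat / (np.sqrt(vhat) + eps)

print(x)   # close to the minimum at [0, 0]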
Google AI Blog: Massively Parallel Graph Computation: From Theory to Practice
       
The proposed model, Adaptive Massively Parallel Computation (AMPC), augments the theoretical capabilities of MapReduce, providing a pathway to solve many graph problems in fewer computation rounds. We also show how the AMPC model can be effectively implemented in practice. Limitations of MapReduce: in order to understand the limitations of MapReduce for developing graph algorithms, consider a simplified variant of the connected components problem. We’ve been happy to see that the AMPC model has already been the subject of further study, and we are excited to learn what other problems can be solved more efficiently using the AMPC model or its practical implementations. To learn more about our recent work on scalable graph algorithms, see videos from our recent Graph Mining and Learning workshop.
Building AI that can understand variation in the world around us
       
But even the most state-of-the-art AI models, models that outperform humans in myriad ways, can struggle with the simple — for humans, at least — task of identifying a golden retriever whether it's viewed head-on, from the side, upside down, leaping through the air, or even covered in mud. It shows that an intuitive way to understand variation factors is to model them as a group of transformations. But how can we discover a data set’s symmetries using a minimal amount of supervision? Finally, tackling real data sets with group theory–based models is challenging because the group structure is not perfectly respected. Extending this idea to more realistic settings and data sets such as images with no artificial augmentation might prove to be a valuable approach going forward.
Explaining Bundesliga Match Facts xGoals using Amazon SageMaker Clarify
       
Bundesliga Match Facts powered by AWS provides a more engaging fan experience during soccer matches for Bundesliga fans around the world, delivering advanced real-time statistics and in-depth insights generated live from official match data. AGG_METHOD is the aggregation method used to compute global SHAP values, which in our case is the mean of absolute SHAP values for all instances. Conclusion: the primary implications for Bundesliga Match Facts powered by AWS going forward are twofold.
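A minimal sketch of how that aggregation method is specified with the SageMaker Clarify SDK; the baseline record and sample count are placeholders:

from sagemaker import clarify

shap_config = clarify.SHAPConfig(
    baseline=[[0.5] * 10],     # placeholder baseline record(s)
    num_samples=100,
    agg_method="mean_abs",     # mean of absolute SHAP values across instances
)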
Helmet detection error analysis in football videos using Amazon SageMaker
       
This visualization has significantly improved our understanding of when and how our helmet detection algorithms fail. In this post, we use static images from the competition data as an example to build a helmet detection model. Next, we used FasterRCNN with ResNet50 FPN as our helmet detection model and used a pretrained model based on COCO data within a PyTorch framework. The goal was not to build an award-winning helmet detection model, but to identify errors in specific images within an entire play with a relatively high-performing model. An ideal helmet detector should detect each and every helmet in each frame, thereby covering the entire area with green.
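A minimal sketch of loading that pretrained detector in torchvision (the pretrained flag pulls COCO weights, matching the setup described; the random image is a placeholder):

import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

with torch.no_grad():
    preds = model([torch.rand(3, 720, 1280)])   # list of images in, list of dicts out
print(preds[0]["boxes"].shape, preds[0]["scores"].shape)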
Perform interactive data processing using Spark in Amazon SageMaker Studio Notebooks
       
Amazon SageMaker Studio is the first fully integrated development environment (IDE) for machine learning (ML). Data engineers and data scientists can also use Apache Spark for preprocessing data and use Amazon SageMaker for model training and hosting. For more details on different connectivity methods, see Securing Amazon SageMaker Studio connectivity using a private VPC. Preprocess data and feature engineering: we perform data preprocessing and feature engineering on the data using SageMaker Processing. For more information and other SageMaker resources, see the SageMaker Spark GitHub repo and Securing data analytics with an Amazon SageMaker notebook instance and Kerberized Amazon EMR cluster.
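A minimal sketch of launching a Spark preprocessing job through SageMaker Processing; the role, script name, versions, and instance settings are placeholders:

from sagemaker.spark.processing import PySparkProcessor

processor = PySparkProcessor(
    base_job_name="spark-preprocess",
    framework_version="2.4",
    role="<your-sagemaker-role>",
    instance_count=2,
    instance_type="ml.m5.xlarge",
)
processor.run(
    submit_app="preprocess.py",                  # your PySpark script
    arguments=["--input", "s3://<bucket>/raw"],
)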
AI for AgriTech: Classifying Kiwifruits using Amazon Rekognition Custom Labels
       
This post seeks to demystify how AWS AI/ML services work together, and specifically shows how you can generate labeled imagery, train machine vision models against that imagery, and deploy custom image recognition models using Amazon Rekognition Custom Labels. To do so, we first load our unlabeled training images into an Amazon Simple Storage Service (Amazon S3) bucket within our account, with each class stored in its own folder under our bucket. Setting up Amazon Rekognition: to start using Amazon Rekognition, complete the following steps. On the Amazon Rekognition console, choose Use Custom Labels. For the policy, enter the provided JSON for the Amazon S3 bucket, to ensure that Amazon Rekognition can access that data to train the model. Summary: in this post, we learned how to use Amazon Rekognition Custom Labels with Amazon S3 folder-based labeling to train an image classification model, deploy that model, and use it to conduct inference.
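Once the model version is trained and running, inference can be called from code. A minimal sketch with boto3; the ARN, bucket, and file names are placeholders:

import boto3

client = boto3.client("rekognition")
response = client.detect_custom_labels(
    ProjectVersionArn="arn:aws:rekognition:<region>:<account>:project/<name>/version/<v>",
    Image={"S3Object": {"Bucket": "<bucket>", "Name": "kiwifruit.jpg"}},
    MinConfidence=80,
)
for label in response["CustomLabels"]:
    print(label["Name"], label["Confidence"])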
Gauss, Imposters, and Making Room for Creativity
       
But a high-level theoretical explanation or a personal reflection about work and identity can be just as useful — just as practical, even! One example is Maxim Ziatdinov’s panoramic look at rotationally invariant variational autoencoders (rVAE) and their ascendancy among those who analyze imaging data for a living. We read posts like the two above in order to learn and grow, yet sometimes — more often than many might think — even well-read, seasoned data scientists experience imposter syndrome. The art and practice of connection: it’s a common trope by now to suggest that data scientists need to be strong storytellers (feel like writing a TDS post about it?).
Several Model Validation Techniques in Python
       
Stratified K-Fold Cross-Validation: the stratified k-fold method is an extension of simple k-fold cross-validation, mainly used for classification problems. This technique is ideal when we have smaller datasets and need to maintain the class ratio, since the class ratio is preserved in every fold. In this article, we have seen different model validation techniques, each serving a different purpose and best suited to different scenarios. Before using any of these validation techniques, always take account of your computational resources, time limit, and the type of problem you are trying to solve.
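A minimal sketch of the stratified split; the toy labels are deliberately imbalanced to show that each fold preserves the class ratio:

import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.arange(20).reshape(10, 2)
y = np.array([0] * 6 + [1] * 4)           # imbalanced classes

skf = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
for train_idx, val_idx in skf.split(X, y):
    print(y[val_idx])                     # each fold keeps the 6:4 ratio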
Hierarchical Clustering and Dendrograms in R for Data Science
       
In the early stages of performing data analysis, an important aspect is to get a high-level understanding of multi-dimensional data and find some sort of pattern between the different variables; this is where clustering comes in. This blog post focuses on agglomerative hierarchical clustering, its applications, and a practical example in R. By now, two questions should arise in your mind. Next, we scale the coordinates to normalize our features to a mean of 0 and variance of 1 (standardization). The good and the bad: dendrograms are 1) an easy way to cluster data through an agglomerative approach and 2) helpful for understanding the data more quickly. On the downside, agglomerative clustering is computationally expensive: a naive implementation has a time complexity of O(n³).
Locally Weighted Linear Regression in Python
       
In locally weighted linear regression, we give the model the x where we want to make a prediction; the model then gives all the x(i)’s around that x a weight close to one, while the rest of the x(i)’s get a weight close to zero, and it tries to fit a straight line to the weighted x(i) data. In linear regression we had the usual least-squares loss function; the modified loss for locally weighted regression adds w(i), the weight for the ith training example, as the only modification (source: geeksforgeeks.org). In short, it only sums over the error terms for the x(i)’s which are close to x. The weights are computed by a function wm(point, X, tau), where tau is the bandwidth and X is the training data. After calculating theta, we can just use the closed-form solution for locally weighted linear regression to predict.
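A minimal sketch completing this idea end to end, with Gaussian weights and the closed-form solution theta = (X^T W X)^-1 X^T W y; the bandwidth, bias column, and toy data are illustrative choices:

import numpy as np

def wm(point, X, tau):
    # Gaussian kernel: weight ~ exp(-||x(i) - x||^2 / (2 tau^2))
    sq = np.sum((X - point) ** 2, axis=1)
    return np.diag(np.exp(-sq / (2 * tau ** 2)))

def predict(x_query, X, y, tau=0.5):
    Xb = np.hstack([np.ones((len(X), 1)), X])     # add a bias column
    qb = np.array([1.0, x_query])
    W = wm(np.array([x_query]), X, tau)
    theta = np.linalg.pinv(Xb.T @ W @ Xb) @ Xb.T @ W @ y
    return qb @ theta

X = np.linspace(0, 3, 50).reshape(-1, 1)
y = np.sin(X).ravel()
print(predict(1.5, X, y))   # approximately sin(1.5) ~ 0.997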
Choosing and Customizing Loss Functions for Image Processing
       
Enter the loss function: a loss function plays a key role when training (optimizing) ML models. Here, the loss function’s role is to help the optimizer correctly predict these different levels of features — from basic patterns through to the final blood cells. The term loss function (sometimes called error function) is often used interchangeably with cost function. While the PerceptiLabs UI makes choosing a loss function a small detail, knowing which loss function to choose for a given use case and model architecture is really a big deal. For more information about loss functions, check out the following articles, which do a great job of explaining some of them in detail. And for those just starting out with ML or needing a refresher on loss functions, be sure to check out Gradient descent, how neural networks learn.
A Guide To Cleaning Text in Python
       
Photo by The Creative Exchange on Unsplash. Text is a form of unstructured data. When we are working with textual data, we cannot go from our raw text straight to our machine learning model. Instead, we must follow a process of first cleaning the text, then encoding it into a machine-readable format. Let’s cover some ways we can clean text — in another post, I’ll cover ways we can encode text. Note: example code from Python Guides.

# creating a unicode string
text_unicode = "Python is easy \u200c to learn"

# encoding the text to ASCII format
text_encode = text_unicode.encode(encoding="ascii", errors="ignore")

# decoding the text
text_decode = text_encode.decode()

# cleaning the text to remove extra whitespace
clean_text = " ".join([word for word in text_decode.split()])
print(clean_text)
>>> Python is easy to learn
How I’m Overcoming My Fear of Math to Learn Data Science
       
I didn’t know much about the data science world until I stumbled onto Medium and discovered the Towards Data Science publication. In short, the article explains how theoretical data science (practiced usually by academics) is quite different from practical data science (usually practiced by industry professionals). Data science as practiced in academia is, generally, much more mathematically intense than data science practiced in industry. The author further explains how foundational data science skills, including data manipulation, data visualization, and exploratory data analysis, don’t actually require much math. By focusing on the four foundational data science skills, I will be competitive in the industry space, not as a data scientist in title, but as someone who can apply data science principles to solve problems.
Deep Image Quality Assessment
       
Image by Author. Before diving deep into the world of image quality assessment, I knew very little about the processing steps in the imaging pipeline. This article is about image impairment assessment with full-reference deep image quality metrics. Quality assessment comes in several flavors: image aesthetic assessment, image impairment assessment, and artefact visibility assessment. Metrics are also grouped by how much of the reference they use: no-reference, where no reference image is provided; reduced-reference, where some information about the reference image is provided; and full-reference, where the full reference image is available. New, large-scale image quality datasets have enabled the development of image quality metrics based on deep learning models.
Modeling Protein-Ligand Interactions with Atomic Convolutional Neural Networks
       
Image from Unsplash. The building blocks of the ACNN are two primitive convolutional operations: atom-type convolution and radial pooling. Radial pooling layers apply a radial filter (pooling function) with learnable parameters. Atomic convolution layers (atom-type convolution + radial pooling) can be stacked by flattening the outputs of the radial pooling layer and feeding them into another atom-type convolution. Reference: Atomic Convolutional Networks for Predicting Protein-Ligand Binding Affinity.
Towards the end of deep learning and the beginning of AGI
       
How recent neuroscience research points the way towards defeating adversarial examples and achieving a more resilient, consistent, and flexible form of artificial intelligence. Painting by the author, Javier Ideami (ideami.com). Adversarial examples are a hot research topic in deep learning nowadays. This fits nicely with Mountcastle’s idea of a single circuit that gets replicated many, many times (volume matters, but is volume enough to push today’s deep learning systems towards AGI?). It is time to return to the adversarial examples and the status of the deep learning field. Just as when we use ensembling in deep learning, we are betting on thousands of angles on the problem, not just one. As Jeff points out, deep learning leaders like Geoffrey Hinton have already been working for quite some time on trying to make deep learning models more flexible (see capsule networks).
Gradient Descent With Nesterov Momentum From Scratch
       
In this tutorial, you will discover how to develop the gradient descent optimization algorithm with Nesterov momentum from scratch. Tutorial Overview: this tutorial is divided into three parts; they are: Gradient Descent, Nesterov Momentum, and Gradient Descent With Nesterov Momentum (covering a two-dimensional test problem, gradient descent optimization with Nesterov momentum, and a visualization of Nesterov momentum). Gradient descent is an optimization algorithm. In the main section, we explore how to implement gradient descent with Nesterov momentum and apply it to the test problem. Summary: in this tutorial, you discovered how to develop the gradient descent optimization with Nesterov momentum from scratch.
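A minimal sketch of the core update, evaluating the gradient at the look-ahead point; the quadratic objective and hyperparameters are illustrative:

import numpy as np

def gradient(x):
    return 2 * x                   # gradient of f(x) = sum(x^2)

x = np.array([1.0, 1.5])
v = np.zeros_like(x)
alpha, momentum = 0.1, 0.9

for _ in range(100):
    g = gradient(x + momentum * v)   # gradient at the projected position
    v = momentum * v - alpha * g
    x = x + v

print(x)   # close to the minimum at [0, 0]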
How to use MLflow on AWS to better track machine learning experiments
       
Photo by Nicolas Thomas on Unsplash. We all know how painful keeping track of your machine learning experiments can be. Quickstart using MLflow tracking: MLflow tracking is a component that will help you log your machine learning experiments very easily, including metrics, parameters, duration, status, and git commit, as well as model outputs and artifacts. Set up a tracking server on AWS: so far, we’ve used MLflow locally; fortunately, setting up a remote MLflow tracking server is quite easy. Quick notes: setting up MLflow as we did is fine for small projects within relatively small teams.
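A minimal sketch of the logging workflow against a remote server; the tracking URI, experiment name, and logged values are placeholders:

import mlflow

mlflow.set_tracking_uri("http://<your-tracking-server>:5000")
mlflow.set_experiment("demo")

with mlflow.start_run():
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("rmse", 0.27)
    mlflow.log_artifact("model.pkl")   # any local file, assumed to exist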
Build an Interactive Machine Learning Model with Shiny and Flexdashboard
       
Create Dashboard Layout: initialize an empty markdown from RStudio using File > New File > R Markdown > Create Empty Document. Create Reactive Expression: to limit re-running the same function repeatedly in different chunks of code, we use a reactive expression; the next time we run the reactive expression, it returns the saved value without any computation. User Outputs, 5.1 Add Linear Regression Output: use a simple renderPrint function to render and print a linear model summary output. Style with CSS: there are several ways to add CSS to the flexdashboard.
Run search engine experiments in Vespa from python
       
pyvespa provides a Python API to Vespa. You can build and deploy a Vespa application using the pyvespa API. Connect to a running Vespa application: in case you already have a Vespa application running somewhere, you can directly instantiate the Vespa class with the appropriate endpoint. There is also the possibility to explicitly export app_package to Vespa configuration files (without deploying them) through the export_application_package method:

vespa_docker.export_application_package(application_package=app_package)

The pyvespa API provides a subset of the functionality available in Vespa. But even in this case, you can still get value out of pyvespa by deploying from Python based on the Vespa configuration files stored on disk.
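A minimal sketch of the connect-and-query path mentioned above; the endpoint and query body are placeholders, and this follows pyvespa’s early API:

from vespa.application import Vespa

app = Vespa(url="https://<your-endpoint>", port=443)
result = app.query(body={
    "yql": "select * from sources * where userQuery();",
    "query": "what is dl",
})
print(result.hits[:1])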
Supercharging NumPy with Numba
       
You become dependent on NumPy functions, as it is very difficult to write optimal custom NumPy ufuncs (universal functions). The recipe: add decorators to instruct Numba to JIT-compile your functions, add annotated types wherever Numba requires them, and replace unsupported NumPy functions with supported ones. How to install Numba?

conda install numba   # anaconda
pip install numba     # pip

How does a Numba-decorated function work? When a Python object is passed to a nopython-mode function, Numba converts it into a Numba-native value; when execution transitions back to the Python interpreter, Numba converts the result back into a Python object.
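A minimal sketch of JIT-compiling a NumPy-style loop; the function and data are illustrative, and the first call triggers compilation while later calls run at native speed:

import numpy as np
from numba import njit

@njit   # nopython-mode JIT compilation
def sum_of_squares(a):
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i] ** 2
    return total

x = np.random.rand(1_000_000)
print(sum_of_squares(x))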
Bias and variance, but what are they really?
       
Bias and variance, but what are they really? When a model has high bias, its predictions are consistently off, at least for certain regions of the data if not the whole range. Can you guess which friend has high bias and which one has high variance? In a broader context, when we say something has high variance, it means it acts erratically and its behavior is hard to predict. For example, many data scientists use the words test data and validation data interchangeably and assume that it’s clear from context.
Word2Vec Research Paper Explained
       
We know what Word2Vec is and how word vectors are used in NLP tasks, but do we really know how they are trained and what the previous approaches for training word vectors were? Well, here is an attempt to explain my understanding of the Word2Vec research paper [T. Mikolov et al.]. Word2Vec Models: this section introduces the models used for training word2vec. Examples of learned relationships: the relationship vector (“R”) between words “A” and “B” is obtained by subtracting word vector (“A”) from word vector (“B”). Resources: [1] Original research paper, Efficient Estimation of Word Representations in Vector Space: https://arxiv.org/pdf/1301.3781.pdf; [2] A Neural Probabilistic Language Model research paper: https://ai.googleblog.com/2016/06/wide-deep-learning-better-together-with.html; [3] Recurrent Neural Network based Language Model research paper: https://www.isca-speech.org/archive/archive_papers/interspeech_2010/i10_1045.pdf; [4] Hierarchical version of softmax research paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.221.8829&rep=rep1&type=pdf#page=255
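A minimal sketch of probing such relationship vectors with gensim; the vector file is a placeholder, and the analogy is the classic example rather than one taken from the paper’s tables:

from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)
# "B - A + C": king - man + woman should land near queen
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=3))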
Will Data Go Cloud Native?
       
Will Data Go Cloud Native? One thing I wonder, though, is whether cloud native data tools are going to become dominant. For a variety of reasons, people have come to use the term “cloud native” for two very different things. “Cloud native” can mean using the cloud service provider’s “native” tools. Cloud Native Data Tools: let’s get real, data tools run on cloud native infrastructure.
Neural Machine Translation: Inner Workings, Seq2Seq, and Transformers
       
Sequence-to-sequence models, as opposed to traditional MT models, are able to map the relation between languages with the help of a parallel corpus. Attention map of a translation. Introduction: recently, I had a chance to work with Neural Machine Translation (NMT) architectures for a term project. There are statistical machine translation methods that try to find the translation model, but this post aims to explain how neural machine translation tackles the problem. Neural Machine Translation and sequence-to-sequence: in 2014, Sutskever et al. introduced the sequence-to-sequence approach.
Entity Embeddings for ML
       
Why Entity Embeddings? For instance, when using word embeddings (which are essentially the same as entity embeddings) to represent each category, a perfect set of embeddings would hold the relationship: king - queen = husband - wife. Performance (mean absolute percent error — lower is better) of algorithms not using and using Entity Embeddings to represent categorical variables on the Rossman dataset (Guo 2016). Entity embeddings can represent categorical variables in a continuous way, retaining the relationship between different data values and thereby facilitating the model’s training. “Entity Embeddings of Categorical Variables”.
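A minimal sketch of an entity-embedding layer for a single categorical feature in Keras; the cardinality, embedding dimension, and regression head are illustrative:

import tensorflow as tf

n_categories, emb_dim = 1000, 8
inp = tf.keras.Input(shape=(1,), dtype="int32")
emb = tf.keras.layers.Embedding(n_categories, emb_dim)(inp)  # learnable vectors
flat = tf.keras.layers.Flatten()(emb)
out = tf.keras.layers.Dense(1)(flat)

model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="mse")
model.summary()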
A Simple and Scalable Clustering Algorithm for Data Summarization
       
Introduction: in the era of big data, the need for designing efficient procedures that summarize millions of data points in a meaningful way is more pressing than ever. More formally, given a set C of k centers, there are three main objectives that are usually studied. The selection step of the heuristic is: set Z = P - {p}; S = S + {p}. Let s(1), …, s(k) be the selected centers of the Gonzalez heuristic, given in the order the algorithm selected them. Since s(i) and s(t) belong to the same cluster C(j), by the triangle inequality this means that D(s(i), s(t)) ≤ D(s(i), c(j)) + D(c(j), s(t)) ≤ OPT + OPT = 2 OPT.
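A minimal sketch of the Gonzalez (farthest-first) heuristic for selecting k centers; Euclidean distance and random data are illustrative:

import numpy as np

def gonzalez(P, k):
    centers = [P[0]]                                   # arbitrary first center
    for _ in range(k - 1):
        # distance from every point to its nearest selected center
        d = np.min([np.linalg.norm(P - c, axis=1) for c in centers], axis=0)
        centers.append(P[np.argmax(d)])                # pick the farthest point
    return np.array(centers)

P = np.random.rand(100, 2)
print(gonzalez(P, 5))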
Microsoft Azure Synapse Analytics Workspace vs. Snowflake Data Cloud
       
However, Azure Synapse requires additional query store setup or DMVs (Dynamic Management Views) to view past queries run in Synapse. Synapse: end-to-end data pipeline phases under one integrated, unified workspace, including data ingestion, data transformation, orchestration, data analytics, administrative monitoring, and source control. Snowflake: the Snowflake UI does not connect to any source control system directly. Synapse: Azure Purview provides data lineage and a data catalog for data in the Synapse SQL pool. DBT comes with an inbuilt data dictionary, data lineage, and many other features needed for a modern data warehouse.
How to Use MongoDB to Store and Retrieve ML Models
       
Store an ML model into MongoDB: first, you need to import the required libraries and initialize a few variables such as MONGO_HOST, MONGO_PORT, and MONGO_DB. Finally, the put() method of the GridFS object is used to store the model in MongoDB. Note that the put() method returns the ObjectId, which is then used when retrieving the model from MongoDB. Retrieve an ML model from MongoDB: to retrieve the ML model, you follow the same steps as earlier except for the last one. You need to pass the ObjectId of the file (model) you want to retrieve to the get() method of GridFS.
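A minimal sketch of the round trip; the connection details and the stand-in “model” are placeholders:

import pickle
import gridfs
from pymongo import MongoClient

MONGO_HOST, MONGO_PORT, MONGO_DB = "localhost", 27017, "models_db"
client = MongoClient(MONGO_HOST, MONGO_PORT)
fs = gridfs.GridFS(client[MONGO_DB])

model = {"weights": [1, 2, 3]}                                 # stand-in for a trained model
object_id = fs.put(pickle.dumps(model), filename="my_model")   # store

restored = pickle.loads(fs.get(object_id).read())              # retrieve
print(restored)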
Finding Time-shift Between Two Timeseries for Maximum Correlation
       
Several engineering and science applications deal with data in the form of timeseries. For timeseries data evenly spaced on the time axis, Pearson’s and Spearman’s correlation or the cross-correlation function provide means to calculate the lag and eventually determine the right time-shift. However, things are not so straightforward when two timeseries are not evenly spaced and their sampling times differ. Strym recognizes any pandas data frame as timeseries data if the data frame has at least two columns named Time and Message.
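For the evenly spaced case, a minimal sketch of recovering the lag with cross-correlation; the pulse and the 25-sample shift are constructed for illustration (correlation_lags needs scipy >= 1.6):

import numpy as np
from scipy import signal

t = np.linspace(0, 10, 500)
a = np.exp(-((t - 3.0) ** 2))      # a pulse centred at t = 3
b = np.roll(a, 25)                 # the same pulse delayed by 25 samples

corr = signal.correlate(b, a)
lags = signal.correlation_lags(len(b), len(a))
print(lags[np.argmax(corr)])       # -> 25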
Genetic Algorithm — Stop Overfitting Trading Strategies
       
Bear with me, this is not yet another paraphrased definition of overfitting. In recent years, a technique called Random Subset Selection (RSS) has been developed to both speed up the training process of a genetic algorithm and reduce its overfitting. Extending RSS with the Coefficient of Variation: the core concept of RSS is to find solutions that are consistent across slices of the dataset, rather than just the entirety of it. CV is also called the deviation risk measure in financial mathematics and unitized risk in actuarial science. The higher the coefficient of variation, the higher the deviation and risk a trading strategy bears. (The formula for the Coefficient of Variation — image by author.)
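The coefficient of variation is simply the standard deviation divided by the mean. A minimal sketch of applying it across data slices, in the spirit of RSS; the returns and slicing are illustrative:

import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.01, 0.05, 1000)            # per-period strategy returns

slice_means = [s.mean() for s in np.array_split(returns, 10)]
cv = np.std(slice_means) / np.mean(slice_means)   # lower CV -> more consistent
print(round(cv, 3))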
Uber AresDB is an Open Source, GPU-Powered Database for Large-Scale Analytics Workloads
       
I recently started an AI-focused educational newsletter that already has over 70,000 subscribers. AresDB is a database and runtime for massively scalable, real-time analytics workloads. Scaling with GPUs: the idea of leveraging GPUs for scaling real-time analytics workloads seems like a perfect fit for Uber. When processing a real-time query, AresDB switches computations from host memory to GPU memory to achieve faster performance. In AresDB, data ingestion is abstracted by an HTTP API that can receive a batch of upsert records.
Fully Explained SVM Classification with Python
       
SVM Classification. We can choose different criteria to divide the classes by choosing different kernel function parameters, given in the classification classes in SVM for the decision points. Gamma: the gamma value is considered to make a smooth hyperplane by checking the distance from the hyperplane to data points. A low value of gamma means it checks the distance of nearby data points, and a large value of gamma means it also measures the distance of far data points from the hyperplane. Conclusion: SVM classification is very useful for classification and regression.
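A minimal sketch of varying the kernel and gamma in scikit-learn; the dataset is a toy example:

from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

for gamma in (0.01, 1.0):
    clf = SVC(kernel="rbf", gamma=gamma).fit(X, y)
    print(gamma, clf.score(X, y))   # higher gamma -> more flexible boundary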
NLP using Deep Learning Tutorials: Understand the Activation Function
       
Activation functions are introduced in neural networks to capture complex relationships in data. There are many types of activation functions; in the following section, I will present five of the most used activation functions in NLP. Sigmoid: the sigmoid is one of the earliest used activation functions. The two activation functions therefore also share the same problems of vanishing and exploding gradients.
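A minimal sketch of the sigmoid and the saturation behind vanishing gradients; pure NumPy for illustration:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-10.0, 0.0, 10.0])
s = sigmoid(x)
print(s)              # saturates near 0 and 1 at the extremes
print(s * (1 - s))    # derivative -> ~0 where saturated (vanishing gradient)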
Must-have Chrome Extensions For Machine Learning Engineers And Data Scientists
       
Browser extensions are the secret weapons that most hackers and developers keep in their arsenal to be more productive. Since a good portion of machine learners use Chrome (given Chrome’s massive market share), I’ve compiled a list of must-have Chrome extensions for machine learning engineers and data scientists. Click here to install the Open in Colab Chrome extension; this extension has over 20,000 installs at the moment! A machine learning engineer specializing in NLP can use the Instant Data Scraper extension to extract reviews and ratings from a site like Yelp, or to scrape listings of wheel sellers in the Bay Area from Craigslist.
How to Build a Prometheus Exporter for Sensor Monitoring
       
Code the Prometheus Exporter for the Sensor. Manage the configuration: the configuration is a YAML file that can live in different locations, searched in order of priority: /etc/dht-prometheus-exporter.yml, $HOME/dht-prometheus-exporter.yml, $PWD/dht-prometheus-exporter.yml. The code below parses the configuration file using the viper package, which makes this information usable in other files of the project. Deal with the logging: the log.go file configures the logrus logger and reuses the default log level present in the configuration. Interface with the sensor: to interface with the sensor, we use the go-dht package; the code below stores information in a struct and wraps the package functions to add some extras. Collect the metrics: the Prometheus package provides the collector concept, which lists the different metrics to collect and expose in the exporter.
Shapash: Making ML Models Understandable by Everyone
       
In this article, we will present Shapash, an open-source Python library that helps data scientists make their machine learning models more transparent and understandable by all! It makes it easier to share and discuss model interpretability with non-data specialists: business analysts, managers, end users. Concretely, Shapash provides easy-to-read visualizations and a web app, which is useful at this phase because these stakeholders need to look at visualizations and graphics. Step 4 — Launching the web app: app = xpl.run_app(). The web app link appears in the Jupyter output (access the demo here). This web app is a useful tool for discussing with business analysts the best way to summarize the explainability to meet operational needs.
Deploy a Python Machine Learning Model on your iPhone
       
It includes: Stepper structs (lines 19–30) for each of our three features, which enable users to modify feature values, and a Button on the navigation bar (lines 31–40) to call our model from within the predictPrice function (line 46). Outside of the NavigationView we have our predictPrice function (lines 46–62), which instantiates our Swift Core ML model class and generates a prediction according to the values stored in our feature states.
K-Nearest Neighbors (KNN) Algorithm Tutorial — Machine Learning Basics
       
The k-nearest neighbor algorithm, commonly known as the KNN algorithm, is a simple yet effective classification and regression supervised machine learning algorithm. What is the K-Nearest Neighbors (KNN) Algorithm? The KNN algorithm is a classical machine learning algorithm that focuses on the distance from new, unlabeled data points to existing labeled data points. (Figure 4: Our example introduces a new data point to illustrate the k-nearest neighbor algorithm.) Implementation of KNN in Python: for the implementation of the KNN algorithm, we will be using the Iris dataset.
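A minimal sketch of the Iris implementation the excerpt refers to (the article's exact code is not shown here):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# classify each new point by a majority vote of its k=5 nearest neighbors
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
print(knn.score(X_test, y_test))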
How to Learn Deep Learning for Beginners?
       
Perhaps the most well-known resource for learning deep learning is Andrew Ng's series of 5 courses on Coursera. Before I try to convince you to start your deep learning journey there, here is a brief description of the course itself. There is also MIT's “Introduction to Deep Learning” (MIT 6.S191), now in its 5th year. MIT has been exceptionally generous in making all the material available online for free, for anyone to learn from. While these courses will be released with one lecture a week, they were delivered virtually over 2 weeks for MIT students starting from 18 January 2021 (so they are about the most up-to-date deep learning lectures you will find!).
The Ultimate Guide to Acing Machine Learning Interviews for Data Scientists and Machine Learning Engineers
       
Machine Learning Basics: machine learning basics are commonly asked in both technical phone screens and onsite interviews to get a quick assessment of a candidate's basic machine learning knowledge. Machine Learning Coding (photo by ThisisEngineering RAEng on Unsplash): the second type of question is the machine learning coding question. As this great blog post points out, the most commonly asked algorithms are, in supervised learning, linear regression, logistic regression, k-nearest neighbors, and decision trees, and in unsupervised learning, k-means clustering. Answering machine learning coding questions is similar to answering generic coding questions. When preparing for applied machine learning questions, you will need to prepare differently for generic versus domain-specific questions (traditional matrix factorization solutions versus deep learning approaches, for example). Project-Based Machine Learning Questions (photo by Van Tay Media on Unsplash): like applied machine learning questions, the purpose of project-based questions is to assess the level of expertise of a candidate.
Learn State-of-the-art Deep Learning Directly from MIT for Free and More!
       
This talk by Stanford Professor Percy Liang highlights taming deep models and shaping their development with two novel approaches. While those concepts are important to master in order to ace machine learning interviews, you may still feel underprepared and be caught off guard if that is all you have prepared for. The truth is that machine learning interviews are more comprehensive than a Q&A on basic machine learning concepts: they evaluate a candidate's capacity to work with a team to solve complex real-world problems using machine learning methodologies.
Google AI Blog: Contactless Sleep Sensing in Nest Hub
       
We extended this technology and developed an embedded Soli-based algorithm that could be implemented in Nest Hub for sleep tracking. However, to understand and improve their sleep, users also need to understand why their sleep is disrupted. (The Nest Hub displays when snoring and coughing may have disturbed a user's sleep, and can track weekly trends.) Special thanks to Ken Mixter for his support and contributions to the development and integration of this technology into Nest Hub, and to Mark Malhotra and Shwetak Patel for their ongoing leadership, as well as the Nest, Fit, Soli, and Assistant teams we collaborated with to build and validate Sleep Sensing on Nest Hub.
Real Time Digit Recognition in iOS
       
(Demo gif by author.) The future of computing involves computers that can see like we see. After the first transformation we get tensors in the shape of (1, 28, 28). The call at::Tensor tensor = torch::from_blob(imageBuffer, {1, 1, 28, 28}, at::kFloat); creates a tensor whose size is given by the second argument, {1, 1, 28, 28}. Training used torchvision.transforms.Normalize((0.1307,), (0.3081,)), and, to emphasize it again, we want the production input data to be as close as possible to the training data. In var rawBytes: [UInt8] = [UInt8](repeating: 0, count: 28 * 28) // 1 byte per pixel, there are a couple of things to notice.
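For reference, this is the training-side preprocessing the excerpt points at; the same mean and standard deviation must be reproduced on the device (a sketch, not the article's exact pipeline):

import torchvision.transforms as T

mnist_transform = T.Compose([
    T.ToTensor(),                        # HxW uint8 -> 1xHxW float in [0, 1]
    T.Normalize((0.1307,), (0.3081,)),   # MNIST mean and std
])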
The Concepts Behind Logistic Regression
       
Cost Function. Why not least squares as the cost function? In logistic regression, the actual y value will be 0 or 1, while the predicted value ŷ will be between 0 and 1. This is one of the reasons least squares is not used as a cost function for logistic regression. Instead, Error = -{y ln ŷ + (1-y) ln (1-ŷ)}, where y is either 0 or 1 and ŷ is always between 0 and 1. Since ln ŷ and ln (1-ŷ) are both negative, the negative sign in front of the expression makes the error positive (in the linear regression least-squares method we square the error instead), so the error is always greater than or equal to zero.
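A quick numeric check of the formula above (toy values, not from the article):

import numpy as np

def binary_cross_entropy(y, y_hat, eps=1e-12):
    # Error = -(y ln y_hat + (1 - y) ln (1 - y_hat)), averaged over samples
    y_hat = np.clip(y_hat, eps, 1 - eps)  # avoid ln(0)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = np.array([1, 0, 1])
y_hat = np.array([0.9, 0.2, 0.6])
print(binary_cross_entropy(y, y_hat))  # >= 0; zero only for perfect predictions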
Developing Scorecards in Python using OptBinning
       
Understanding OptBinning Classes. OptBinning has 3 main, hierarchically related class types that perform all the processing needed to bin your features and create a scorecard. OptimalBinning, ContinuousOptimalBinning, and MulticlassOptimalBinning: OptimalBinning is the base class for performing binning of a feature with a binary target. The usage is fairly simple, with just a few parameters needed to bin a full dataset. ScoreCard: the ScoreCard class offers the possibility of combining the binned dataset generated from a BinningProcess with a linear estimator from scikit-learn to generate a production-ready scorecard. Figure 1 summarizes the relationship of the classes that are part of OptBinning.
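A minimal OptimalBinning usage following the library's quick-start (the data here is synthetic; the article binned real credit features):

import numpy as np
from optbinning import OptimalBinning

x = np.random.randn(1000)                          # one numerical feature
y = (x + np.random.randn(1000) > 0).astype(int)    # binary target

optb = OptimalBinning(name="feature", dtype="numerical")
optb.fit(x, y)
print(optb.binning_table.build())                  # bins, event rates, WoE, IV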
Implementing Single Shot Detector (SSD) in Keras: Part IV — Data Augmentation
       
This article focuses on photometric augmentation and geometric augmentation, the data augmentation techniques mentioned in the SSD paper. Photometric augmentation is done by shifting each original pixel value (r, g, b) to a new pixel value (r′, g′, b′) (Taylor & Nitschke, 2015). The remainder of this section discusses Random Brightness, Random Contrast, Random Hue, Random Lighting Noise, and Random Saturation, and how data augmentation is used in SSD (Single Shot Detector).
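A hedged TensorFlow sketch of the photometric operations listed above (the parameter ranges are illustrative, not the article's exact values):

import tensorflow as tf

def photometric_augment(image):
    # random photometric shifts applied to a float image in [0, 1]
    image = tf.image.random_brightness(image, max_delta=0.12)
    image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
    image = tf.image.random_hue(image, max_delta=0.05)
    image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
    return tf.clip_by_value(image, 0.0, 1.0)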
Causal Inference in Data Science: Valid Inferential Coverage with Multiple Comparisons
       
Let us motivate the multiple testing problem with a simple example. Recall that in a hypothesis testing framework, a type-I error occurs when a null hypothesis is rejected and the alternative accepted even though the null hypothesis is in fact true. This phenomenon becomes progressively worse as the number of individual hypothesis tests under study grows: we simply keep conducting more and more tests until we “find” a statistically significant result and deem it an “important discovery”. Some of our control techniques are valid under arbitrary dependence structures, while others are only valid under independent or positively correlated tests. Therefore, if we are exclusively conducting two-sided tests, the possibility of negatively correlated tests is not of particular concern.
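A hedged illustration of multiple-testing control (the article's own techniques may differ): Bonferroni versus Benjamini-Hochberg on ten p-values.

import numpy as np
from statsmodels.stats.multitest import multipletests

p_values = np.array([0.001, 0.008, 0.039, 0.041, 0.042,
                     0.060, 0.074, 0.205, 0.212, 0.216])

for method in ("bonferroni", "fdr_bh"):
    reject, p_adj, _, _ = multipletests(p_values, alpha=0.05, method=method)
    print(method, reject.sum(), "rejections")  # FDR control rejects more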
How to know which Statistical Test to use for Hypothesis Testing?
       
One-sample test vs two-sample test: a one-sample test is a statistical procedure for the analysis of one column or feature, i.e., hypothesis testing of one random variable. A two-sample test is performed on the data of two random variables, each obtained from an independent population. Statistical test between two categorical variables: when your experiment is trying to draw a comparison or find the difference between two categorical random variables, you can use the chi-squared test to test for a statistical difference.
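SciPy sketches of the three tests mentioned above, on toy data (not the article's):

import numpy as np
from scipy import stats

a = np.random.normal(5.2, 1.0, size=30)
b = np.random.normal(4.8, 1.0, size=30)

print(stats.ttest_1samp(a, popmean=5.0))  # one-sample test on one variable
print(stats.ttest_ind(a, b))              # two-sample test, independent groups

table = [[30, 10], [20, 25]]              # contingency table of two categoricals
print(stats.chi2_contingency(table)[:2])  # chi-squared statistic and p-value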
Gradient Kernel Regression
       
Gradient kernel regression, because it can be constructed without the hard work of training by gradient descent, can serve as a useful tool for model architecture design. We are interested in examining how a kernel regression using the gradient kernel performs as the parameters w are modified by gradient descent applied to the network. Unfortunately, gradient descent drives the gradient kernel into an ill-conditioned and numerically unstable regime. Gradient kernel regression can be used to explore the possible forms of this final layer efficiently. Gradient kernel regression provides a mechanism for testing the performance of a network without going through the gradient descent training process.
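A hedged sketch of the construction being described: the gradient kernel k(x, x′) is the inner product of the network's parameter gradients at x and x′, and it can be plugged into kernel ridge regression with no gradient-descent training at all (the architecture and sizes here are invented):

import torch

net = torch.nn.Sequential(torch.nn.Linear(3, 16), torch.nn.Tanh(),
                          torch.nn.Linear(16, 1))

def grad_features(x):
    net.zero_grad()
    net(x).sum().backward()
    return torch.cat([p.grad.flatten() for p in net.parameters()])

X = torch.randn(20, 3)
y = torch.randn(20)
G = torch.stack([grad_features(x.unsqueeze(0)) for x in X])  # n x p gradients
K = G @ G.T                                                  # gradient kernel matrix

alpha = torch.linalg.solve(K + 1e-3 * torch.eye(20), y)      # kernel ridge fit
y_hat = K @ alpha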
An Introduction to Time Series Analysis with ARIMA
       
Examples of time series data include the S&P 500 Index, disease rates, mortality rates, blood pressure tracking, and global temperatures. This post looks at how autoregressive integrated moving average (ARIMA) models work and are fitted to time series data. A time series is white noise (random) if the variables are independent and identically distributed (i.i.d.) with a mean of zero. Differencing works as follows: if d=0, yₜ = Yₜ; if d=1, yₜ = Yₜ − Yₜ₋₁; if d=2, yₜ = (Yₜ − Yₜ₋₁) − (Yₜ₋₁ − Yₜ₋₂). Now let's consider ARIMA(1,1,1) for the time series x. Summary: ARIMA is a great tool for time series analysis, and auto-ARIMA packages make the process of fine-tuning a lot easier. Always plot your data and perform exploratory data analysis (EDA) to get a better understanding of the data.
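A hedged statsmodels sketch of fitting the ARIMA(1,1,1) mentioned above (a synthetic random walk stands in for the article's series):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

x = np.cumsum(np.random.randn(200))      # random walk; d=1 makes it stationary
model = ARIMA(x, order=(1, 1, 1)).fit()  # order = (p, d, q)
print(model.summary())
print(model.forecast(steps=5))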
Assessment of the impact of AI on carbon emissions and possible mitigations
       
These questions revolve around how AI technology can help with the carbon footprint issue, which is usually overshadowed by the accuracy factor, accuracy being the primary reason for developing AI. A study conducted the previous year revealed that training an off-the-shelf AI language-processing system generates 1,400 pounds of carbon emissions. The resources consumed to produce a top-notch AI model have been doubling every 3.4 months. In contrast, the previous AI model, GPT-2, was trained on a dataset of 40 billion words to gain accuracy [3]. We have to learn, in a more general sense, that no single projection relies solely on AI technology.
Stop using Image Interpolation for Neural Audio Synthesis
       
In this story I want to advance your current understanding of neural upsamplers in the context of audio synthesis. The same holds true for audio synthesis using popular architectures like GANs, U-Nets, or autoencoders. Upsampling artifacts: all the above-mentioned upsampling methods introduce artifacts into a neural audio synthesis model, although in theory your model could learn a way to minimize these artifacts. They achieve better SDR scores and are computationally more efficient than image interpolation methods. This is great since it keeps our model simple and performant; your training times should improve significantly if you were using interpolation methods before!
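The two families of upsamplers being contrasted, as a hedged PyTorch sketch (layer sizes are illustrative):

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 16, 8000)             # (batch, channels, samples)

up_interp = F.interpolate(x, scale_factor=2, mode="nearest")  # image-style
up_learned = nn.ConvTranspose1d(16, 16, kernel_size=4, stride=2,
                                padding=1)(x)                 # learned upsampler
print(up_interp.shape, up_learned.shape)  # both (1, 16, 16000)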
Demand forecast with different data science approaches
       
(Photo by Jess Bailey on Unsplash.) In this story, I would like to give an overview of common data science techniques and frameworks for creating a demand forecast model. Demand forecasting is the pivotal business process around which the strategic and operational plans of a company are devised. OK, let's solve this problem and try different data science techniques and frameworks to make an accurate demand forecast. The first task when initiating a demand forecasting project is to provide the client with meaningful insights. The main idea in using machine learning models for demand forecasting is to generate a lot of useful features.
Multilingual CLIP with Huggingface + PyTorch Lightning ⚡
       
(OpenAI CLIP algorithm, taken from the official blog.) This is a walkthrough of training CLIP by OpenAI. CLIP was designed to put both images and text into a new projected space such that they map to each other by simply looking at dot products. This method allows you to map text to images, but it can also be used to map images to text if the need arises. This particular blog, however, is specifically about how we managed to train this on Colab GPUs using Hugging Face transformers and PyTorch Lightning. In terms of which element is the true positive within a batch, remember that we are sending image-caption pairs already lined up.
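A hedged sketch of that contrastive objective: with aligned image-caption pairs, the i-th image's true positive is the i-th caption, so the cross-entropy targets over the similarity matrix are simply the diagonal.

import torch
import torch.nn.functional as F

def clip_loss(image_emb, text_emb, temperature=0.07):
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.T / temperature  # (batch, batch) dot products
    targets = torch.arange(len(logits))            # positives on the diagonal
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

print(clip_loss(torch.randn(8, 512), torch.randn(8, 512)))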
Develop a Neural Network for Banknote Authentication
       
n_features = X.shape[1]  # number of input features (variable names assumed; the extractor repeated this fragment several times)
# define model
model = Sequential()
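A hedged end-to-end sketch in the spirit of the tutorial: a small Keras MLP for binary banknote authentication (layer sizes and the random stand-in data are assumptions, not the tutorial's exact values):

import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# X: (n_samples, 4) banknote features, y: 0/1 authenticity labels
X = np.random.rand(1372, 4)          # stand-in for the real dataset
y = np.random.randint(0, 2, 1372)

n_features = X.shape[1]
model = Sequential()
model.add(Dense(10, activation="relu", input_shape=(n_features,)))
model.add(Dense(8, activation="relu"))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=10, batch_size=32, verbose=0)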
3 Common Problems with Neural Network Initialization
       
Too-small Initialisation — Vanishing Gradient. Why does it happen: the initialised weights of the neural network are too small. Result: premature convergence. Symptoms: model performance improves very slowly during training. To better understand that, let's assume that we have the three-layer fully connected neural network below, where every layer has a sigmoid activation function with zero bias (Three-layer Fully Connected Neural Network — Source: Image by Author). If we quickly recall what a sigmoid function and its derivatives look like, we can see that the maximum value of the derivative is 0.25. Assuming a stands for the output of the activation function and σ stands for the sigmoid function, the gradients shrink as they propagate backwards through each layer. The model has not converged yet; it is simply suffering from a vanishing gradient! Ways to alleviate vanishing gradient can be:
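A numeric illustration of the argument above (not the article's code): with too-small weights and sigmoid activations, the backpropagated gradient shrinks at every layer.

import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

x = rng.standard_normal(64)
grad = np.ones(64)
for layer in range(5):
    W = 0.01 * rng.standard_normal((64, 64))  # too-small initialisation
    a = sigmoid(W @ x)
    grad = W.T @ (grad * a * (1 - a))         # chain rule through the sigmoid
    print(f"layer {layer}: mean |grad| = {np.abs(grad).mean():.2e}")
    x = a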
How to reconstruct an image if you see only a few pixels
       
Those making sense to human beings belong to what we could call the set of natural images. In mathematical terms, we say that a natural image is sparse in the Fourier basis. Reconstructing an image from a handful of pixels is a massively underdetermined problem: only 10% of the pixels are actually recorded. Our goal is thus to infer what the original image x is, given that we observed only a few of its pixels. Denoting by Ψ the n × n matrix mapping from Fourier space to pixel space, our measurement equation becomes y = CΨs, where s is the Fourier transform of x (i.e., x = Ψs) and C selects the observed pixels.
Rediscovering Unit Testing: Testing Capabilities of ML Models
       
Triggered by two papers [1, 2], I have recently been reading and thinking a lot about testing of machine-learned models. Traditional accuracy evaluations and their assumptions: to understand the relevance of testing capabilities, it is important to understand the assumptions and limitations behind traditional accuracy evaluations. [6] Slicing test data: finally, we can also search a large pool of test data for test instances that are relevant to our capability, for example, all sentences that include negation. If we want to avoid sampling from the original test data, we can also collect previously unlabeled data in production, identify potentially challenging cases, and then label those as test data for our capability. This departs from the traditional evaluation assumptions just as testing capabilities does, but potentially with the same benefits.
Pool Detection from Aerial Imagery
       
(Photo by CHUTTERSNAP on Unsplash.) “Talk is cheap.” — Linus Torvalds. There's a lot of talk about swimming pool detection from aerial imagery. I managed to find a government resource that provides high-quality aerial imagery; if you're looking for free high-resolution aerial imagery, this is probably your best bet. Model: initially I created the model using icevision.
Dr. Machine: Can it Diagnose COVID-19?
       
First and foremost, we need to import os, matplotlib, image from tensorflow.keras, cv2, imageio, and Image from PIL. As that was done earlier, we will just sort the images into two lists: train_imgs_normal for normal lung images and train_imgs_pneumonia for pneumonic images. print_images(normal_images, ‘Normal Images’): as you can see, the normal lung images are relatively clear with no obstructions. print_images(pneumonia_images, ‘Pneumonia Images’): we can see that pneumonic lungs are cloudier than normal lungs and relatively obstructed when we try to view them. Apart from this, we also got insight into how machine learning can help healthcare workers diagnose these illnesses faster!
How I published my PL SQL Program as REST API through Python
       
(Photo by Fernando Brasil on Unsplash.) Sudha is an IT professional with decades of experience in SQL, PL/SQL, databases, and ERP applications. In this article, I explain what a REST API is and how a classical PL/SQL program can be converted into a REST API. In the case of client-server communication, the server assigns a unique signature (token) and informs the client. REST API: now that you understand how token-based communication works, it's time to understand REST APIs. Suppose you have a Python program, you enter http://www.medium.com, and the program connects directly to the server (step 3).
Ensemble Models
       
Ensemble Learning. Building ensemble models is not only about the variance of the algorithm used. When we ensemble multiple algorithms to adapt the prediction process, we need an aggregating method to combine the multiple models. It is very likely that ensemble models without a careful training process will quickly produce highly overfitting models. When deploying ensemble models into production, the time needed to pass through multiple models increases and could slow down the prediction tasks' throughput.
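A hedged example of one aggregation method: majority (hard) voting over three different classifiers with scikit-learn.

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression()),
                ("dt", DecisionTreeClassifier()),
                ("nb", GaussianNB())],
    voting="hard",  # one vote per model; "soft" would average probabilities
)
ensemble.fit(X, y)
print(ensemble.score(X, y))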
Convolutional layer hacking with Python and Numpy
       
At this point we will have NumPy input data of 1x3x130x130 and PyTorch input data of 1x3x128x128. Notice that the NumPy data incorporates the padding whereas the PyTorch data doesn't, because the PyTorch conv2d layer will apply the padding by itself. These, for example, are the kernels in the NumPy data structure, printed with np.around(w, decimals=2) for visual clarity: the first kernel detects horizontal lines (each 3x3 slice has rows of -1.1, 2.2, -1.1), the second is a blurring kernel (all values 0.11), and the third is a sharpening kernel. As for the biases, we add them to the calculations connected to each of the output channels (those calculations are stored in the variable for each of the output channels). To verify the result against PyTorch: pytorch_verify = ptorch_out[0][0][0].cpu().detach().numpy(); pytorch_verify = np.around(pytorch_verify, decimals=3); numpy_verify = np.around(numpy_out[0][0][0], decimals=3); assert np.allclose(pytorch_verify, numpy_verify). Both print the same first 25 values (0.978 followed by 0.997 repeated). It's looking good!
Using CNN Extracted Features In Fun New Ways
       
Unpacking the features learned by a deep convolutional neural network (CNN) is a daunting task. (Feature similarity heatmap, block 4 — images by Everingham et al.) Let's see how well this works using features from block 4, this time using a ResNet-50 model (ad hoc segmentation using ResNet-50 block 4 features — images by Everingham et al.). At such a low-resolution grid of just 16x16, the resulting mask captures too much of the region around the plane, so let's see if we can fix this by using block 3 features (ad hoc segmentation using ResNet-50 block 3 features — images by Everingham et al.). Another idea is to use the same image in both the left and right panels to see if a model is learning semantically meaningful features during the course of training.
A Neural Algorithm of Artistic Style: Summary and Implementation
       
Neural style, or neural transfer, allows reproducing a given image with a new artistic style. The algorithm receives a style image, a content image, and an input image, which can be either an empty white image or a copy of the content image. Style loss: in order to calculate the style loss, the authors use Gram matrices. Finally, the style loss is given by the squared-error loss between the style and input Gram matrices. Results: the figure above shows the content and style images as well as the generated input image.
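A hedged PyTorch sketch of the Gram-matrix style loss described above (the normalization convention is one common choice, not necessarily the article's):

import torch
import torch.nn.functional as F

def gram_matrix(features):
    b, c, h, w = features.shape
    f = features.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)  # channel-channel correlations

def style_loss(input_features, style_features):
    return F.mse_loss(gram_matrix(input_features), gram_matrix(style_features))

x = torch.randn(1, 64, 32, 32)  # input-image features at some layer
s = torch.randn(1, 64, 32, 32)  # style-image features at the same layer
print(style_loss(x, s))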
Implementation of Semi-Supervised Generative Adversarial Networks in Keras
       
(Photo by mohammad alizade on Unsplash.) Everyone has heard about supervised learning and unsupervised learning, but there is also another set of learning techniques in between them called semi-supervised learning. Generative adversarial networks (GANs) are a class of generative models designed by Ian Goodfellow and his colleagues in 2014. Here, the discriminator's job is not only to distinguish between real and fake images but also to classify the labeled training images into their correct classes. Generator network: we use the Keras Sequential API to build our generator model. (Fake images generated by the generator — source: image by author.) The fake images generated are quite realistic, with few exceptions.
Explainable AI (XAI) design for unsupervised deep anomaly detector
       
An interpretable prototype of unsupervised deep convolutional neural network and LSTM autoencoder based real-time anomaly detection from high-dimensional heterogeneous/homogeneous time series multi-sensor data. The basic idea is that the LSTM network has multiple “gates” inside of it with trained parameters. The idea is to convert a two-dimensional dataset of dimension [Batch Size, Features] into a three-dimensional dataset [Batch Size, Lookback Size, Features]. Training is launched with Anamoly.compute(X, Y, LOOKBACK_SIZE=10, num_of_numerical_features=26, MODEL_SELECTED=MODEL_SELECTED, KERNEL_SIZE=KERNEL_SIZE, epocs=30); the training loss decreases steadily from 0.2189 at epoch 1 to about 0.0095 at epoch 28.
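A hedged sketch of that 2-D to 3-D conversion: build lookback windows so each sample carries its recent history (not the prototype's exact code).

import numpy as np

def to_lookback(data, lookback_size=10):
    # (batch, features) -> (batch - lookback + 1, lookback, features)
    windows = [data[i:i + lookback_size]
               for i in range(len(data) - lookback_size + 1)]
    return np.stack(windows)

X = np.random.randn(100, 26)          # 26 numerical sensor features
X3d = to_lookback(X, lookback_size=10)
print(X3d.shape)                      # (91, 10, 26)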
Automating Machine Learning Model Optimization
       
(Photo by Hunter Harritt on Unsplash.) Creating a machine learning model is a difficult task because we need a model that works best for our data and that we can optimize for better performance and accuracy. Generally, making a machine learning model is easy, but finding the best parameters and optimizing them is a time-consuming process. There are certain libraries/packages that allow you to automate this process and build machine learning models effortlessly; we can use these packages to select the best model for our data and also the best parameters for it. best_proposal then holds the best model (source: by author). This is how you can use BTB to select the best-performing machine learning model with the best hyperparameter values.
Segmenting Abnormalities in Mammograms (Part 3 of 3)
       
It comprises two main parts — the encoder block (top half of the diagram) and the decoder block (bottom half of the diagram). Decoder block: a network of upsampling convolutions and concatenation layers that is symmetric to the encoder block. As shown in Fig. 1, selected feature maps from the encoder block are concatenated to their corresponding layers in the decoder block. We have to bear in mind that only the appropriate image augmentations can help with our model's learning! (Fig. 3: examples of image augmentations — random rotation, shear, width shifting, and height shifting — that do not apply to our use case.)
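A hedged Keras sketch of one decoder step: upsample, then concatenate the matching encoder feature map (the skip connection described above); filter counts are illustrative.

from tensorflow.keras.layers import Concatenate, Conv2D, Conv2DTranspose

def decoder_block(x, skip, filters):
    x = Conv2DTranspose(filters, 2, strides=2, padding="same")(x)  # upsample 2x
    x = Concatenate()([x, skip])                                   # skip connection
    x = Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x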
17 types of similarity and dissimilarity measures used in data science.
       
However, before going any further, let's explain how we can use the Euclidean distance in the context of machine learning. Each data point comes along with its own label: Iris-setosa or Iris-versicolor (0 and 1 in the dataset). Up to this point, everything looks great, and our KNN classifier is ready to classify a new data point; we need a way to let the model decide where the new data point belongs. Aside from that, the Manhattan distance would be preferred over the Euclidean distance if the data contains many outliers.
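The two distances mentioned above, computed on a pair of feature vectors (toy values):

import numpy as np

a = np.array([5.1, 3.5, 1.4, 0.2])
b = np.array([6.4, 3.2, 4.5, 1.5])

euclidean = np.sqrt(np.sum((a - b) ** 2))  # straight-line distance
manhattan = np.sum(np.abs(a - b))          # sum of absolute differences
print(euclidean, manhattan)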
Understanding the Amazon Rainforest with Multi-Label Classification + VGG-19, Inceptionv3, AlexNet & Transfer Learning
       
Classification: this portion of the convolutional neural network is responsible for classifying the images into one or more of the predetermined classes, ending in a fully connected layer, the final layer in a convolutional neural network. Curious about which convolutional neural network architectures work best on satellite imagery, I decided to experiment with some common architectures and compare the performance of each model. Keras offers deep learning image classification models with pre-trained weights available for transfer learning. Popular architectures such as Xception, VGG16, VGG19, ResNet50, Inceptionv3, DenseNet121, EfficientNetB7, and others are available to be used.
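A hedged sketch of the multi-label transfer-learning setup: a pre-trained VGG19 trunk with a sigmoid head (17 is the Amazon satellite dataset's tag count; input size is an assumption).

from tensorflow.keras import Model
from tensorflow.keras.applications import VGG19
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

base = VGG19(weights="imagenet", include_top=False, input_shape=(128, 128, 3))
base.trainable = False                       # keep the pre-trained features

x = GlobalAveragePooling2D()(base.output)
out = Dense(17, activation="sigmoid")(x)     # sigmoid: labels are not exclusive

model = Model(base.input, out)
model.compile(optimizer="adam", loss="binary_crossentropy")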
A Complete Logistic Regression Algorithm for Image Classification in Python From Scratch
       
Logistic regression is very popular in machine learning and statistics. This article focuses on image classification with logistic regression and gives a detailed explanation of how a simple logistic regression algorithm works. Let's see how the cost function changed with each update of the ‘w’s and ‘b’: plt.figure(figsize=(7,5)); plt.scatter(x=range(len(d['costs'])), y=d['costs'], color='black'); plt.title('Scatter Plot of Cost Functions', fontsize=18); plt.ylabel('Costs', fontsize=12); plt.show(). Look: with each iteration, the cost function went down, as it should. Conclusion: if you could run all the code and understand most of it, you just learned how logistic regression works!
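A hedged sketch of the from-scratch core such an article builds: the sigmoid, the cost, and one gradient-descent update of w and b (shapes and data are invented).

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate(w, b, X, y):
    m = X.shape[1]
    y_hat = sigmoid(w.T @ X + b)
    cost = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    dw = X @ (y_hat - y).T / m           # gradient w.r.t. weights
    db = np.mean(y_hat - y)              # gradient w.r.t. bias
    return cost, dw, db

X = np.random.rand(64, 10)               # (n_pixels, m_examples) flattened images
y = np.random.randint(0, 2, (1, 10))
w, b = np.zeros((64, 1)), 0.0
for _ in range(100):
    cost, dw, db = propagate(w, b, X, y)
    w -= 0.1 * dw; b -= 0.1 * db
print(cost)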
Clustering Project? That’s CUTE.
       
Clustering often starts as an innocent act; for example, a product manager is determined to discover who their product's users are. Unfortunately, this translates into clustering methods falling victim to extensive misuse. This begs the question: if the chance of deriving any benefits from generic clustering methods is low, why do most companies focus on them so heavily in the first place? What does this have to do with clustering methods? Instead of attempting to map recommendations one-to-one, the usual suggestion is to group users into a small number of buckets.
How we learnt to love the rotationally invariant variational autoencoders (rVAE), and (almost) stopped doing PCA
       
However, in many cases the scientists analyzing imaging data are interested in specific shapes. More complex versions of (r)VAE, including joint (r)VAE, semi-supervised (r)VAE, and (r)VAE augmented with normalizing flows that allow for unknown and partially known classes, will be discussed later. Here, we derive the latent space and the distribution of points in the latent space colored by angle and digit. Alternatively, the latent space can be sampled on a rectangular grid of points, and the corresponding images can be decoded and plotted as a latent space representation. This behavior is linked to a rather fundamental aspect of physics, namely the topological structure of the latent space and the data space.
Is Data Science a science?
       
At its core, all fundamental science is about making predictions in the form of experiments: precise, quantifiable, falsifiable predictions. The examination of basic science has revealed an intimate relationship between such fundamental science and computation. As it happens, an important component of fundamental science deals with those problems in the universe that are computable. So how does this relate to Data Science?
3 Examples to Show Python Altair is More Than a Data Visualization Library
       
(Photo by Sigmund on Unsplash.) Altair is a statistical data visualization library for Python. It provides flexible and versatile methods to transform and filter data while creating a visualization; in that sense, Altair can also be considered a data analysis tool. We will go over 3 examples that demonstrate how Altair expedites the exploratory data analysis process. The setup: import numpy as np; import pandas as pd; import altair as alt; cols = ['Type','Price','Distance','Date','Landsize','Regionname']; melb = pd.read_csv("/content/melb_data.csv", usecols=cols, parse_dates=['Date']).sample(n=1000).reset_index(drop=True); melb.head()
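A hedged example of the transform/filter methods mentioned above, applied to the melb frame loaded in the setup (the threshold is invented):

import altair as alt

chart = (
    alt.Chart(melb)
    .mark_circle(opacity=0.5)
    .encode(x="Distance:Q", y="Price:Q", color="Type:N")
    .transform_filter(alt.datum.Price < 2_000_000)  # filter inside the chart spec
)
chart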
Validation Curve Explained — Plot the influence of a single hyperparameter
       
(Photo by CHUTTERSNAP on Unsplash.) In machine learning (ML), model validation is used to measure the effectiveness of an ML model, and this is where you plot the validation curve. The validation curve is a graphical technique that can be used to measure the influence of a single hyperparameter. Today, we build a random forest model and plot the validation curve based on it. Now, let's interpret the validation curve that we plotted.
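A hedged sketch of plotting a validation curve for one random forest hyperparameter with scikit-learn (synthetic data, illustrative parameter range):

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import validation_curve

X, y = make_classification(n_samples=500, random_state=0)
param_range = [5, 10, 50, 100, 200]
train_scores, val_scores = validation_curve(
    RandomForestClassifier(random_state=0), X, y,
    param_name="n_estimators", param_range=param_range, cv=5)

plt.plot(param_range, np.mean(train_scores, axis=1), label="train")
plt.plot(param_range, np.mean(val_scores, axis=1), label="validation")
plt.xlabel("n_estimators"); plt.ylabel("accuracy"); plt.legend(); plt.show()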
Non-Linear Augmentations For Deep Learning
       
In this post, I cover two augmentations for image data that are not very popular in deep learning but may be useful for extending your dataset. Horizontal wave transformation is a non-linear transformation that distorts the pixels of an image in the shape of a horizontal cosine wave of a given amplitude and frequency. (Result, right, of the horizontal wave transformation applied to a checkerboard image, left.) Mathematically, the horizontal wave transformation for a given pixel (x, y) is given by the formula below.
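The article's exact formula is not in the excerpt; a common form of a horizontal wave maps (x, y) to (x + A·cos(2πy/T), y). A hedged NumPy sketch under that assumption:

import numpy as np

def horizontal_wave(image, amplitude=8, period=64):
    # shift each row horizontally by a cosine of its y-coordinate (assumed form)
    h = image.shape[0]
    out = np.zeros_like(image)
    for y in range(h):
        shift = int(amplitude * np.cos(2 * np.pi * y / period))
        out[y] = np.roll(image[y], shift, axis=0)
    return out

checker = (np.indices((256, 256)).sum(axis=0) // 32 % 2 * 255).astype(np.uint8)
wavy = horizontal_wave(checker)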
Google AI Blog: LEAF: A Learnable Frontend for Audio Classification
       
However, unlike computer vision models, which can learn from raw pixels, deep neural networks for audio classification are rarely trained on raw audio waveforms. As a consequence, standard mel filterbanks are used for most audio classification tasks in practice, even though they are suboptimal. In “LEAF: A Learnable Frontend for Audio Classification”, accepted at ICLR 2021, we present an alternative method for crafting learnable spectrograms for audio understanding tasks. Performance: we apply LEAF to diverse audio classification tasks, including recognizing speech commands, speaker identification, acoustic scene recognition, identifying musical instruments, and finding birdsong. (D-prime score, higher is better, of LEAF, mel filterbanks, and previously proposed learnable spectrograms on the AudioSet evaluation set.)
Learning from videos to understand the world
       
Today, we're announcing a project called Learning from Videos, designed to automatically learn audio, textual, and visual representations from the data in publicly available videos uploaded to Facebook, in order to continuously improve our core systems and power entirely new applications. Our latest technique for learning speech representations, called wav2vec 2.0, works by first masking a portion of the speech and then learning to predict the masked speech units. It will be valuable to build smarter AI systems that can understand what's happening in videos at a more granular level. Our Learning from Videos project signals a paradigm shift in the way machines are able to understand videos, sending us down the path of building smarter AI systems.
TimeSformer: A new architecture for video understanding
       
What the research is: Facebook AI has built and is now sharing details about TimeSformer, an entirely new architecture for video understanding. (Video classification accuracy of TimeSformer versus state-of-the-art 3D convolutional neural networks on the action recognition benchmarks Kinetics-400, left, and Kinetics-600, right.) The idea is to apply temporal attention and spatial attention separately, one after the other. Furthermore, we found that divided space-time attention is not only more efficient but also more accurate than joint space-time attention. Read the full paper: “Is space-time attention all you need for video understanding?”
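A hedged PyTorch sketch of "divided" space-time attention: attend over time for each spatial patch, then over space within each frame (dimensions are illustrative, not TimeSformer's implementation):

import torch
import torch.nn as nn

B, T, S, D = 2, 8, 196, 768            # batch, frames, patches/frame, dim
x = torch.randn(B, T, S, D)

time_attn = nn.MultiheadAttention(D, num_heads=8, batch_first=True)
space_attn = nn.MultiheadAttention(D, num_heads=8, batch_first=True)

xt = x.permute(0, 2, 1, 3).reshape(B * S, T, D)   # sequences over time
xt, _ = time_attn(xt, xt, xt)
x = xt.reshape(B, S, T, D).permute(0, 2, 1, 3)

xs = x.reshape(B * T, S, D)                       # sequences over space
xs, _ = space_attn(xs, xs, xs)
x = xs.reshape(B, T, S, D)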
How Latent Space used the Amazon SageMaker model parallelism library to push the frontiers of large-scale transformers
       
Furthermore, even once we determine how to split our model, introducing model parallelism was a significant engineering task for us to do manually across our research and development lifecycle. The model parallelism library in SageMaker makes model parallelism more accessible by providing automated model splitting, also referred to as automated model partitioning, and sophisticated pipeline run scheduling. When encountering out-of-memory errors during training, switch to a model-parallel approach using the SageMaker distributed model parallel library; otherwise, consider using the SageMaker distributed data parallel library.
From forecasting demand to ordering – An automated machine learning approach with Amazon Forecast to decrease stockouts, excess inventory, and costs
       
The team followed a Kaizen approach, learning from previously unsuccessful models and deploying models only when they were successful. Forecast demand: in this section, we discuss the steps of forecasting demand for each store-SKU combination. The team tested multiple forecasting techniques, like time series models, regression-based models, and deep learning models, before choosing Forecast. We recommend that you try Amazon Forecast to improve your supply chain operations; you can learn more about Amazon Forecast here.
Storage & Compute for Machine Learning
       
(Photo by Vitor Pinto on Unsplash.) Over the past two articles we covered the various activities involved in data collection and storage. These resources default to our on-premises GPUs, but we are also able to burst by firing up GPU instances on AWS. Finally, we leverage BI tools to assess the relationship between our training data, stored in S3 and Redshift, and model performance. This analysis is also invaluable in helping us figure out gaps in our training data: the bigger the overlap between our training data and the overall data (i.e.
Linear Regression With Bootstrapping
       
(Bootstrapping Linear Regression | Photo by Ahmad Dirini.) This article builds on my Linear Regression and Bootstrap Resampling pieces; the following content helps to explain two types of bootstrapping as applied to linear regression models. We treat the data sample we have as the only representation of the population that we have. To fight this, we can apply a different type of bootstrapping, called non-parametric bootstrapping, whereby we apply bootstrapping to the residuals, not the parameter itself. In this article we looked at applying bootstrapping techniques to linear regression in two ways. Parametric bootstrapping — resampling from all of the points: sample the data with replacement numerous times (100), fit a linear regression to each sample, store the coefficients (intercept and slopes), and plot a histogram of the parameters. Non-parametric bootstrapping — resampling the residuals with an uneven distribution of feature values: find the optimal linear regression on all the original data, extract the residuals from the fit, create new y-values using the residual samples, fit the linear regression with the new y-values, store the slopes and intercepts, and plot a histogram of the parameters. Like Ulysses in his “Odyssey” — I hope I have created a simple narrative that helps illustrate bootstrap resampling in the context of linear regressions.
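A hedged sketch of the first recipe: resample (X, y) pairs with replacement, refit, and collect the slope distribution (synthetic data, not the article's):

import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 200)
y = 2.0 * X + 1.0 + rng.normal(0, 2, 200)

slopes = []
for _ in range(100):
    idx = rng.choice(len(X), size=len(X), replace=True)   # resample with replacement
    slope, intercept = np.polyfit(X[idx], y[idx], deg=1)  # fit a line to the sample
    slopes.append(slope)

print(np.mean(slopes), np.std(slopes))  # bootstrap estimate and its spread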
“DAG Card” is the new “Model Card”
       
Introduction. “Imitation is the sincerest form of flattery” — O. Wilde (possibly imitating someone else). Software is eating the world, and A.I. experts struggle to understand a model built by somebody else, especially if it is not a rarefied model from the literature but an actual API serving millions of requests a day. We liked the Model Card idea so much that we decided to hack together a Metaflow card generator ourselves, self-documenting ML pipelines from code and comments. To get a feeling for what we are building, this is a screenshot of our sample DAG card: a glimpse of a DAG card illustrating owners, tasks, input files, and parameters. The accompanying code shows how to programmatically generate a card like this from a Metaflow class [screenshot by the author — original Google card here].
Learning from Audio: Time Domain Features
       
Introduction. While deep learning often works in the frequency domain, there are still many relevant features in the time domain that are useful for many machine learning techniques. Simply put, these features can be extracted and analyzed to understand a waveform's properties. When extracting features in the time domain, we generally study the amplitude of each sample. For copyright reasons, I cannot share the songs in question, but I will share the output plots as well as the genre of each song. As always, all the code can be found in the Learning from Audio GitHub repository.
A Deep Dive into OpenAI CLIP with Multimodal neurons
       
(Photo by Moritz Kindler on Unsplash.) A few months ago, OpenAI released CLIP, a transformer-based neural network that uses Contrastive Language-Image Pre-training to classify images. The best thing about CLIP is that a few days ago another small paper was released exploring CLIP's interpretability (source: OpenAI). This paper is quite interesting because it reveals a lot of useful information that helps explain why CLIP performs so well. However, I want to explain how the most common model interpretability techniques work in general before talking about CLIP. Deep Dream image generation: the key difference between Deep Dream image generation and class-specific image generation is that the starting image is no longer random.
Feature Importance in Linear Models
       
Standardized dataset or not? The answer: only if the dataset was standardized before training can the coefficients be used as feature importance. For example, if we applied a standard scaler to the raw dataset and then fit the model, we could say that the feature importance of age_of_a_house is 20. Coefficients are feature importances of linear models only if the dataset was standardized. Different linear models can also have totally different opinions about feature importance. Be mindful that if the coefficients change significantly between folds, we should be cautious about using them as feature importance.
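A hedged sketch of the recipe: standardize first, then read coefficients as importances (synthetic data):

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=4, random_state=0)
pipe = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)

# with all features on the same scale, coefficient magnitudes are comparable
print(pipe.named_steps["linearregression"].coef_)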
Model selection in machine learning
       
Model selection and validation. In my previous post, we went over polynomial regression. We divide our dataset into three parts: a training set, a validation set (sometimes called a development set), and a test set. We then train our model on the training set, perform model selection on the validation set, and do a final evaluation of the model on the test set. When we end up fitting our model perfectly to our training dataset, we say that we are overfitting. k-fold cross-validation makes it possible to repeat the training-validation process k times, eventually going through the entire original training set as both training and validation data.
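A hedged scikit-learn sketch of the k-fold idea described above:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores, scores.mean())  # each fold serves once as the validation set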
Is Federated Data Sharing Our Last Great Hope to Scare Off the Next AI Winter?
       
Even within organisations, data can be stored away in silos, often leading frustrated data teams to repeat work or miss potential gold mines. There have been shifts to more centralised and/or decoupled architectures like the data lake, lakehouse, or data mesh, which do hold some promise of a solution for internal data sharing. There are solutions on the horizon falling under the umbrella of ‘Federated Data Sharing’. Further reading: an in-depth look at federated sharing in financial services and in healthcare, a great review of some recent ethical guidelines, and a great look at the requirements for data scientists and data engineers.
How to Apply Transfer Learning Successfully
       
(Photo by Scott Graham on Unsplash.) The availability of highly accurate deep learning models trained over thousands of compute hours has led to the adoption of transfer learning in production. The past decade has seen leaps in deep learning research, automating several arduous tasks. In this article, we apply transfer learning to an image classification task using pre-trained model weights and fine-tuning for our case. Note that a model trained for one objective may not transfer well to a very different one; for example, a CNN trained for adversarial networks might not solve an image classification task with the same high accuracy. (Transfer learning guide based on dataset size and similarity to the pre-trained model's dataset.) Similarly, in the case of NLP tasks, we can utilize pre-trained word embeddings and train the rest based on our model architecture.
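A hedged Keras sketch of the fine-tuning recipe: freeze the pre-trained trunk and train a new head on your task (class count and input size are assumptions):

from tensorflow.keras import Model
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

base = ResNet50(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                      # reuse ImageNet features as-is

x = GlobalAveragePooling2D()(base.output)
out = Dense(5, activation="softmax")(x)     # 5 target classes, as an example
model = Model(base.input, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# later: unfreeze some top layers and re-compile with a low LR to fine-tune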
Siamese Networks Introduction and Implementation
       
Less data means a deep learning model will not be able to model the different classes properly and will perform poorly. Siamese network basic structure: a Siamese network is a class of neural networks containing two or more identical subnetworks that share weights. For same-class input pairs, the target output is 1; for different-class input pairs, the output is 0. For same-class pairs, the distance between their embeddings is small; for different-class pairs, the distance is larger.
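A hedged PyTorch sketch of the structure: one shared encoder applied to both inputs, with the distance between embeddings as the similarity signal (sizes are illustrative):

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(),
                        nn.Linear(128, 32))

def pair_distance(x1, x2):
    # identical weights on both branches -- that is what makes it "Siamese"
    return torch.nn.functional.pairwise_distance(encoder(x1), encoder(x2))

a, b = torch.randn(4, 1, 28, 28), torch.randn(4, 1, 28, 28)
print(pair_distance(a, b))  # small for same-class pairs after training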
6G and Artificial Intelligence with Security Problems
       
While machine learning offers significant advantages for 6G, the security of AI models is often ignored. This post proposes a mitigation method, based on adversarial learning, for adversarial attacks against 6G machine learning models. The main idea behind adversarial attacks on machine learning models is to produce faulty results by manipulating trained deep learning models for 6G applications. (Citation: “6G with Adversarial Machine Learning Security Problems: Millimeter Wave Beam Prediction Use-Case”, 2021, https://arxiv.org/abs/1905.01999.) 1. We assume the classifier h has been trained to minimize a loss function ℓ; we then modified this optimization function for a 6G security problem.
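A hedged sketch of a standard attack of the kind described (FGSM; the paper's exact attack and models may differ): perturb the input along the sign of the loss gradient.

import torch
import torch.nn.functional as F

def fgsm(model, x, y, epsilon=0.05):
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + epsilon * x.grad.sign()).detach()  # adversarial example

# usage: x_adv = fgsm(trained_model, inputs, labels)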
Solving Cybersickness with AI
       
360° VR experiences can transport you anywhere in the world, but no matter how you move in the real physical world, you remain frozen in the virtual world. 6 DOF motion includes 3 DOF from rotational motion (looking about) and 3 DOF from positional motion (moving about). Technically speaking, synthesising movement in 360° VR environments using AI enhances the experience by providing 6 DOF motion — in a sense retrofitting 3 DOF 360° content with 6 DOF support. Users experienced a 360° VR scene with 3 DOF using standard 360° VR technology, and also with 6 DOF provided by Kagenova's copernic360.
This is why your deep learning models don’t work on another microscopy scanner
       
First of all, there are quite a few microscopy scanner manufacturers that do a really good job and deliver fine products. Additionally, a microscopy scanner costs on the order of €100k, so it's not something you buy more of than you need. And this is what I expected from a much, much more expensive microscopy scanner as well, at first. Let's now get back to our original question: why don't deep learning models work on images from another lab? That is, on the other hand, not so surprising after all, since one scanner of the TUPAC16 dataset was also an (older-model) Aperio scanner.
XGBoost for Regression
       
Tutorial overview: this tutorial is divided into three parts: Extreme Gradient Boosting, the XGBoost Regression API, and an XGBoost Regression Example. Extreme Gradient Boosting: gradient boosting refers to a class of ensemble machine learning algorithms that can be used for classification or regression predictive modeling problems. XGBoost Regression API: XGBoost can be installed as a standalone library, and an XGBoost model can be developed using the scikit-learn API. Check the version with: import xgboost; print(xgboost.__version__). An XGBoost regression model can be defined by creating an instance of the XGBRegressor class, for example: model = XGBRegressor(). XGBoost Regression Example: in this section, we look at how we might develop an XGBoost model for a standard regression predictive modeling dataset.
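A hedged end-to-end sketch of the scikit-learn-style API described above (synthetic data stands in for the tutorial's dataset; hyperparameters are illustrative):

from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

X, y = make_regression(n_samples=500, n_features=10, random_state=7)

model = XGBRegressor(n_estimators=100, max_depth=4, learning_rate=0.1)
scores = cross_val_score(model, X, y, scoring="neg_mean_absolute_error", cv=5)
print(-scores.mean())

model.fit(X, y)
print(model.predict(X[:1]))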
PDF document pre-processing with Amazon Textract: Visuals detection and removal
       
Amazon Textract can detect text in a variety of documents, including financial reports, medical records, and tax forms. Some documents, however, contain visuals whose embedded text convolutes the Amazon Textract output or isn't required by your downstream process; this information should be removed before using Amazon Textract to analyze the document. For the second method, we write a custom pixel concentration analyzer to detect the location of these visuals, filtering out sections using minimum and maximum black-pixel concentration thresholds.
Train a Hand Detector using Tensorflow 2 Object Detection API in 2021
       
(Image by author.) We use Google Colab to train our custom object detector on a dataset of egocentric hand images. I wanted to make a computer vision application that could detect my hands in real time. There were so many articles online that used the TensorFlow Object Detection API and Google Colab, but I still struggled a lot to actually get things working. Things we will be using: Google Colab, the TensorFlow Object Detection API 2, and the EgoHands dataset: http://vision.soic.indiana.edu/projects/egohands/. Steps: 1. Set up the environment. Open up a new Google Colab notebook and mount your Google Drive. Google Colab will be using TensorFlow 2, but just in case, select it explicitly:
Analyzing Facial Recognition Patents with LDA Topic Modeling
       
Topic modeling with latent Dirichlet allocation: how does topic modeling work? Let's look at an implementation of topic modeling on a real dataset: we put our tf-idf bag-of-words corpus into an LDA topic model using Gensim's LdaMulticore model. Facial recognition patents in the U.S. have apparently focused heavily on biometric authentication, with a secondary focus on photography. Facial recognition patents in China, on the other hand, have been extremely erratic since 2009.
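A hedged Gensim sketch of that pipeline: dictionary, tf-idf corpus, then LdaMulticore (a tiny toy corpus stands in for the patent data):

from gensim.corpora import Dictionary
from gensim.models import LdaMulticore, TfidfModel

docs = [["face", "recognition", "camera"], ["biometric", "authentication", "face"],
        ["photo", "camera", "lens"], ["biometric", "security", "access"]]

dictionary = Dictionary(docs)
bow = [dictionary.doc2bow(d) for d in docs]
tfidf_corpus = TfidfModel(bow)[bow]

lda = LdaMulticore(tfidf_corpus, id2word=dictionary, num_topics=2, passes=10)
print(lda.print_topics())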
Aggregation of Reddit Comments Using a K-means Clustering Algorithm & RobertaTokenizer
       
Scikit-learn's k-means clustering algorithm will help in this process, with the TfidfVectorizer and the Hugging Face Roberta tokenizer preparing the input data. K-means collects the data points and aggregates them based on various similarities. From this, we can conclude that the tokenization process affects the clusters produced by k-means. Conclusion: the groups created by k-means clustering can help you ignore comments that carry no new information.
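A hedged sketch of the vectorize-then-cluster pipeline on a few toy comments:

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

comments = ["great post, thanks", "thanks, very helpful post",
            "what GPU did you use?", "which GPU is best for this?"]

X = TfidfVectorizer().fit_transform(comments)
labels = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(X)
print(labels)  # near-duplicate comments should share a cluster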
Applying Machine Learning to Assess Florida’s Climate-Driven Real Estate Risk (part1)
       
(Inferred data types of features.) Load the property data: we need to clean up the raw data before we can build our model, using helpers that remove columns from a feature matrix with more than a set threshold of null values, remove_single_value_features, which removes columns in a feature matrix where all the values are the same, and remove_highly_correlated_features, which removes columns in a feature matrix that are highly correlated with another column. In code: df_t = remove_single_value_features(df_t); df_t = remove_highly_correlated_features(df_t). Clean up the flood risk data: the flood risk data is a prominent feature of this study, so we want to clean up the formatting a bit.
Introduction to Global Optimization Algorithms for Continuous Domain Functions:
       
(Photo by Salmen Bejaoui on Unsplash.) In this article, I outline the implementation of the Ant Colony Optimization (ACO) algorithm (with sample code) and how to use it to solve the minimization of some common benchmark continuous-domain functions. Global optimization algorithms can be broadly categorized as deterministic global optimization [8] and metaheuristic global optimization [9]; ACO is a nature-inspired metaheuristic optimization routine, and this article focuses primarily on it. Problem specification: use the ACO algorithm to minimize 5 classical benchmark continuous-domain functions. Benchmark test results are given in addition to the results obtained for the Ackley function (described in section 3). In future articles, I will explore other nature-inspired metaheuristic algorithms for the optimization of continuous-domain problems.
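One of the classical benchmarks mentioned above, the 2-D Ackley function (global minimum 0 at the origin), as a reference implementation with the standard constants a=20, b=0.2, c=2π:

import numpy as np

def ackley(x, y):
    return (-20 * np.exp(-0.2 * np.sqrt(0.5 * (x**2 + y**2)))
            - np.exp(0.5 * (np.cos(2 * np.pi * x) + np.cos(2 * np.pi * y)))
            + np.e + 20)

print(ackley(0.0, 0.0))  # ~0.0, the global minimum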
Why Aspiring Data Scientists Should Not Make a Big Deal of Machine Learning
       
As a data scientist, you should be able to retrieve the data you need from a relational database. SQL is not only used for retrieving data but also serves as an efficient data analysis tool. A more general term for such operations is data wrangling, and there are many software libraries for it, such as Pandas for Python and the Tidyverse for R. You should master at least one tool for data wrangling. However, you should at least be able to access and retrieve data from the cloud.
A Comprehensive Mathematical Approach to Understand AdaBoost
       
The AdaBoost Algorithm. To start us off, let's take a look at the mathematical summary of the AdaBoost algorithm (from The Elements of Statistical Learning: Data Mining, Inference, and Prediction). At the beginning of each training iteration, we sample the dataset using the current weights to get the actual training data. (Training dataset sampled from the original dataset; result of sampling our dataset.) As a result, as we go into further iterations, the training data will mostly comprise data that is often misclassified by our stumps.
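A hedged sketch of AdaBoost's core arithmetic: the learner weight alpha and the multiplicative re-weighting that makes misclassified samples more likely to be resampled (toy values):

import numpy as np

weights = np.full(6, 1 / 6)                    # start uniform
misclassified = np.array([0, 1, 0, 0, 1, 0])   # this stump's mistakes

err = np.sum(weights * misclassified)          # weighted error rate
alpha = 0.5 * np.log((1 - err) / err)          # the stump's say in the final vote

weights *= np.exp(alpha * np.where(misclassified, 1, -1))
weights /= weights.sum()                       # renormalize
print(alpha, weights)                          # mistakes now weigh more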
Insight into a few basic deep learning algorithms
       
Definition: Wikipedia defines deep learning as a class of machine learning algorithms that uses multiple layers to progressively extract higher-level features from the raw input. In this article, we are going to discuss a few deep learning algorithms, such as deep belief networks, generative adversarial networks, transformers and graph neural networks. RBMs have three parts: an input (visible) layer, a hidden layer, and a bias. In the example that I gave above, the visible units represent whether you like the restaurant or not. Using this probability, the hidden units can turn on or turn off nodes in the visible layer. Each sub-network's hidden layer serves as the visible layer for the next network.
Custom Audio Classification with TensorFlow
       
In this section, we'll go over the creation of our audio data: we cover creating training, validation, and test data. Going on, we implement a function that returns all the speech samples available for a speaker. Note that so far we have not created any training data; we merely sort our speech samples into non-overlapping splits. We load the speakers and, for each gender, use the first 80% of the speakers as training speakers, the next 10% as validation speakers, and the last 10% as test speakers. For a list of file paths (speech_list), we load the speech samples, randomly permute the audio, and store them.
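A minimal sketch of that speaker-level split, assuming speakers is a list of speaker IDs for one gender (the IDs below are hypothetical):

speakers = [f"spk{i:02d}" for i in range(20)]  # hypothetical speaker IDs

n = len(speakers)
train_speakers = speakers[: int(0.8 * n)]
val_speakers = speakers[int(0.8 * n): int(0.9 * n)]
test_speakers = speakers[int(0.9 * n):]
# Splitting by speaker means no speaker's audio leaks across train/val/test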
Deep Learning for Land Cover Classification of Satellite Imagery Using Python
       
Deep learning methods are beating out traditional machine learning approaches on virtually every single metric, and Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) are popularly used for land cover classification. According to "CS231n: Convolutional Neural Networks for Visual Recognition", Convolutional Neural Networks are very similar to ordinary neural networks: they are made up of neurons that have learnable weights and biases. Let's start coding. Read data: let's read the 12 bands using rasterio and stack them into an n-dimensional array using the numpy.stack() method.
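A minimal sketch of that loading step, assuming 12 single-band GeoTIFF files (the file names are placeholders):

import numpy as np
import rasterio

band_paths = [f"band_{i}.tif" for i in range(1, 13)]  # hypothetical file names

bands = []
for path in band_paths:
    with rasterio.open(path) as src:
        bands.append(src.read(1))  # read the single band as a 2D array

stack = np.stack(bands)  # shape: (12, height, width)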
Google AI Blog: A New Lens on Understanding Generalization in Deep Learning
       
(Figure: test soft-error for the ideal-world and real-world models during SGD iterations, ResNet-18 architecture.) The real-world model is trained on 50K samples for 100 epochs, and the ideal-world model is trained on 5M samples for a single epoch. For example, some advances like convolutions, skip-connections, and pre-training help primarily by accelerating ideal-world optimization, while other advances like regularization and data augmentation help primarily by decelerating real-world optimization. The main benefit of data augmentation is through the second term, prolonging the real-world optimization time. Concluding thoughts: the Deep Bootstrap framework provides a new lens on generalization and empirical phenomena in deep learning.
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
       
Our analysis provides a theoretically-justified explanation for the empirical robustness of MaxEnt RL, and proves that MaxEnt RL is itself a robust RL algorithm. In the rest of this post, we'll provide some intuition into why MaxEnt RL should be robust and what sort of perturbations MaxEnt RL is robust to. (Videos in the original post compare standard RL and MaxEnt RL agents trained without an obstacle, then evaluated both without and with the obstacle.) Theory: we now formally describe the technical results from the paper. MaxEnt RL is robust to adversarial perturbations of the hole (where the robot inserts the peg). Conclusion: in summary, our paper shows that a commonly-used type of RL algorithm, MaxEnt RL, is already solving a robust RL problem.
Translate video captions and subtitles using Amazon Translate
       
Amazon Translate supports the ability to ignore tags and only translate text content in HTML documents. Translate this delimited file using the asynchronous batch processing capability in Amazon Translate. A fragment of the SRT captions reads:

Welcome to the blog demonstrating the ability to

2
00:00:07,000 --> 00:00:11,890
translate from one language to another using Amazon Translate.

Conclusion: in this post, we demonstrated how to translate video captions and subtitles in WebVTT and SRT file formats using Amazon Translate asynchronous batch processing. For more information, see Translating documents with Amazon Translate, AWS Lambda, and the new Batch Translate API.
Batch image processing with Amazon Rekognition Custom Labels
       
Amazon Rekognition Custom Labels allows you to identify the objects and scenes in images that are specific to your business needs. Amazon Rekognition Custom Labels provides a simple end-to-end experience where you start by labeling a dataset, and Amazon Rekognition Custom Labels builds a custom ML model for you by inspecting the data and selecting the right ML algorithm. It takes advantage of AWS services such as Amazon EventBridge, AWS Step Functions, Amazon Simple Queue Service (Amazon SQS), AWS Lambda, and Amazon Simple Storage Service (Amazon S3). If there are items to process in the queue, the workflow starts the Amazon Rekognition Custom Labels model. To learn how to train a model, see Getting Started with Amazon Rekognition Custom Labels.
Boost your Network Performance
       
Boost your Network Performance. You prepared a big dataset with millions of samples, designed a state-of-the-art neural network, and trained it for 100 epochs, but couldn't get a satisfactory result? Cyclical Learning Rates (CLR) and the Learning Rate Range Test (LRR) are useful procedures that define the most appropriate learning rate for a particular use case. Still, a smaller value is necessary when choosing a constant learning rate, or the network will not begin to converge. Some researchers recommend using one cycle smaller than the total number of iterations/epochs and allowing the learning rate to decrease below the initial learning rate for the remaining iterations. Reference: A disciplined approach to neural network hyper-parameters: Part 1 — learning rate, batch size, momentum, and weight decay.
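A minimal PyTorch sketch of cyclical learning rates; the base_lr and max_lr bounds would come from an LR range test, and the values here are placeholders:

from torch import nn, optim

model = nn.Linear(10, 2)  # stand-in model
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Cycle the learning rate between the two bounds over 2000 steps up
scheduler = optim.lr_scheduler.CyclicLR(optimizer, base_lr=1e-4, max_lr=1e-1,
                                        step_size_up=2000)

# in the training loop: optimizer.step(), then scheduler.step() each batch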
Classification Using Bi-LSTM Neural Network
       
When you train a model with labelled data (like the regression in figure 6, decision trees, SVMs…), you are performing supervised learning. Your model tries to reduce its error by comparing its predicted output to the labelled data. This is why unsupervised learning is generally less accurate and people tend to avoid it. The advantage is that you don't need labelled data, which is usually more expensive to obtain. You usually set the initial parameters in your model and change these parameters until you find the best combination (a.k.a. hyperparameter tuning).
Open source operationalisation: identifying solutions to help achieve three-star deployments
       
Open source operationalisation: identifying solutions to help achieve three-star deploymentsImage by Karolina Grabowska on PexelsIn my previous articles, I talked about the similarities between operationalisation of open source analytics, and working in a restaurant. This should help you to achieve open source operationalisation worthy of three stars! This means diverse analytics solutions working together, which is exactly what we need to effectively operationalise open source. For technology, the enterprise needs to strike a balance between choice for its users and control for governance. Final thoughtsI hope that it will be clear that collaboration, shared knowledge of the process and openness are key elements to achieving high quality, efficient and repeatable open source operationalisation.
5 Simple Questions to Find Data for a Machine Learning Project
       
5 Simple Questions to Find Data for a Machine Learning Project: find the data you need to kick off a machine learning project by considering what data is available and what business need it satisfies. Many businesses want innovation, and machine learning often comes up as a suggestion; but how can you find the data you need to try out a machine learning project? Below are 5 straightforward questions that I have found useful in finding data for new machine learning projects. Financial data is a great place to start a machine learning/data analysis project because it usually meets all the other data questions mentioned above. If you're looking for a place to start, here are three potential use cases for finance data with machine learning:
What is the Best Facial Recognition Software to Use in 2021?
       
I’ve done my best to make a comprehensive list of all the modern face recognition solutions on the market. Types of Facial Recognition Solutions Available on the MarketThe first thing you should know is that there’s a huge variety of facial recognition solutions. I would split facial recognition services into three types, each with its own advantages and disadvantages. Before we get down to comparing the best facial recognition software, I want to clarify that I’ve chosen accuracy as a key parameter for my research. Best Free Facial Recognition Software Solutions to Use in 2021This is probably the most popular free face recognition library, as it has 37.6k stars on GitHub.
Apache Spark on Windows: A Docker approach
       
Apache Spark on Windows: A Docker approach. Recently I was allocated to a project where the entire customer database is in Apache Spark / Hadoop. For example, one project can use Apache Spark 2 with Scala, and another project Apache Spark 3 with PySpark, without any conflict. With all that said, let's get down to business and set up our Apache Spark environment. Install Docker for Windows: you can follow the start guide to download Docker for Windows and the instructions to install Docker on your machine. To learn more about Docker start options, you can visit the Docker docs.
Demystifying Neural Networks Pt.2
       
Obviously, no straight line will ever perfectly fit the data above, and we expect logistic regression to perform poorly in such a scenario. 1-Hidden-Layer Neural Network: let's add one hidden layer to our network. We now have one input layer, one hidden layer and one output layer. Note that σ′(zₗ₋₁) = σ(zₗ₋₁)*(1−σ(zₗ₋₁)) but, by definition, σ(zₗ₋₁) is the hidden layer activation aₗ₋₁. The only difference is that we have to update both the output and the hidden layer weights.
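A quick numerical check of the sigmoid-derivative identity used above:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = 0.7
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)  # finite difference
analytic = sigmoid(z) * (1 - sigmoid(z))                     # the identity
print(np.isclose(numeric, analytic))  # True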
What is Cloud Computing? The Key to Putting Models into Production
       
Big Data & Cloud ComputingAlthough “Big Data” has been a buzzword lately and has gained interest from big businesses, it’s typically difficult to define. Although I’m not going to dive into details in this blog, it’s important to note that there are three cloud computing models. It’s also subject to the concerns that many other cloud computing platforms face. Also, as a cloud computing service, it’s highly scalable while being affordable. Google Cloud Platform (GCP)Just like AWS and Azure, GCP is a suite of public cloud computing services offered by Google.
A Journey Through Neural Networks (Part 0) —Fast Introduction to Linear Algebra
       
Linear transformation: let's call T our linear transformation. First, to be sure that T really is a linear transformation, it must follow some rules (Figure 12). The linear transformation T from ℝ⁴ → ℝ² defined by T(w, x, y, z) = (x − y + z, w − z + y) is given by the matrix shown in Figure 14. Derivative of f(x, y) = 3x³ + 4y with respect to x (i.e., with y treated as a constant). With that and everything else explained above, you should be able to differentiate the function f(x, y) with respect to y.
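Both results can be worked out directly. The columns of T's matrix are the images of the standard basis vectors: T(1,0,0,0) = (0, 1), T(0,1,0,0) = (1, 0), T(0,0,1,0) = (−1, 1), T(0,0,0,1) = (1, −1), so

T(w, x, y, z) = [ 0  1 −1  1 ] (w, x, y, z)ᵀ
                [ 1  0  1 −1 ]

And for f(x, y) = 3x³ + 4y, the power rule gives ∂f/∂x = 9x² (with y held constant) and ∂f/∂y = 4.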
Basin Hopping Optimization in Python
       
How to use the basin hopping optimization algorithm API in Python. Tutorial overview: this tutorial is divided into three parts: Basin Hopping Optimization; Basin Hopping API; and Basin Hopping Examples (Multimodal Optimization With Local Optima; Multimodal Optimization With Multiple Global Optima). Basin Hopping Optimization: basin hopping is a global optimization algorithm developed for use in the field of chemical physics. Now that we are familiar with the basin hopping algorithm at a high level, let's look at the API for basin hopping in Python. Basin Hopping API: basin hopping is available in Python via the basinhopping() SciPy function. Basin Hopping Examples: in this section, we will look at some examples of using the basin hopping algorithm on multimodal objective functions.
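A minimal sketch of the SciPy API on a toy multimodal function (the objective is illustrative, not from the tutorial):

import numpy as np
from scipy.optimize import basinhopping

def objective(x):
    # Cosine ripples over a parabola: many local minima, global minimum at x = 0
    return x[0] ** 2 + 10 * (1 - np.cos(x[0]))

result = basinhopping(objective, x0=[4.0], niter=100, seed=1)
print(result.x, result.fun)  # best input found and its objective value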
Google AI Blog: Accelerating Neural Networks on Mobile and Web with Sparse Inference
       
On-device inference of neural networks enables a variety of real-time applications, like pose estimation and background blur, in a low-latency and privacy-conscious way. One way to optimize a model is through use of sparse neural networks [1, 2, 3], which have a significant fraction of their weights set to zero. In this post, we provide a technical overview of sparse neural networks — from inducing sparsity during training to on-device deployment — and offer some ideas on how researchers might create their own sparse models. If so, it switches from its standard dense inference mode to sparse inference mode, in which it employs a CHW (channel, height, width) tensor layout. Future workWe find sparsification to be a simple yet powerful technique for improving CPU inference of neural networks.
Maximum Entropy RL (Provably) Solves Some Robust RL Problems
       
Maximum Entropy RL (Provably) Solves Some Robust RL ProblemsNearly all real-world applications of reinforcement learning involve some degree of shift between the training environment and the testing environment. In a recent paper, we prove that every MaxEnt RL problem corresponds to maximizing a lower bound on a robust RL problem. In the rest of this post, we’ll provide some intuition into why MaxEnt RL should be robust and what sort of perturbations MaxEnt RL is robust to. ConclusionIn summary, this paper shows that a commonly-used type of RL algorithm, MaxEnt RL, is already solving a robust RL problem. We do not claim that MaxEnt RL will outperform purpose-designed robust RL algorithms.
Using integrated ML to deliver low-latency mobile VR graphics
       
What it is:A new low-latency, power-efficient framework for running machine learning in the rendering pipeline for standalone VR devices that use mobile chipsets. This architecture makes it possible to use ML on these devices to significantly improve image quality and video rendering. We have created a sample application under this framework to reconstruct higher-resolution rendering (known as super-resolution) to improve VR graphics fidelity on a mobile chipset with minimal compute resources. How it works:In a typical mobile VR rendering system, the application engine retrieves movement-tracking data at the beginning of each frame and uses this information to generate images for each eye. Why it mattersCreating next-gen VR and AR experiences will require finding new, more efficient ways to render high-quality, low-latency graphics.
Active learning workflow for Amazon Comprehend custom classification models – Part 2
       
The Amazon Comprehend custom classification API enables you to easily build custom text classification models using your business-specific labels without learning machine learning (ML). The function transforms the human-reviewed data payload to a comma-separated training data format, required by Amazon Comprehend custom classification models. Create a human review workflowTo create your human review workflow, complete the following steps:On the Amazon A2I console, choose Human review workflows. This post provides a reusable pattern and infrastructure for active learning workflows for custom classification models. To learn more about automatic model building, selection, and deployment of custom classification models, you can refer to Active learning workflow for Amazon Comprehend custom classification models – Part 1.
Active learning workflow for Amazon Comprehend custom classification models – Part 1
       
The Amazon Comprehend custom classification API enables you to easily build custom text classification models using your business-specific labels without learning ML. Solution architectureIn this two-part series, we discuss an architecture pattern that allows you to build an active learning workflow for Amazon Comprehend custom classification models. Step Functions workflowThe following diagram shows the Step Functions workflow for automatic model building, endpoint creation, and deploying Amazon Comprehend custom classification models. For more information about the feedback loops and human review workflow, see the second part of this blog series, Active learning workflow for Amazon Comprehend custom classification models – Part 2. For more information about custom classification in Amazon Comprehend, see Custom Classification.
Introducing a new API to stop in-progress workflows in Amazon Forecast
       
Stop a resource job that is importing your datasetsYou have two options to stop importing a dataset. Stop a resource job that is training a predictorYou have two options to stop a resource job that is training your predictor. Stop a resource job that is generating forecastsYou have two options to stop a resource job that is generating your forecasts. Stop a resource job that is exporting forecastsLastly, you can stop a resource job that is exporting your forecasts. One option is to select the forecast export job listed in the Forecast details section and choose Stop.
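A hedged sketch of calling the post's new StopResource API from the SDK, assuming boto3 exposes it as stop_resource (the ARN below is a placeholder):

import boto3

forecast = boto3.client("forecast")

# Stop an in-progress job (dataset import, predictor training, forecast
# generation, or export) by passing the ARN of the resource to stop
forecast.stop_resource(
    ResourceArn="arn:aws:forecast:us-east-1:123456789012:predictor/my_predictor"
)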
A live AI web-server with Intel Neural Compute Stick and RaspberryPi
       
The software: this project is made of three different components, and each of them is characterised by a specific tool. We need to capture images using the webcam. Machine learning: given an image, the tricky part is that the Raspberry Pi has to ask the Neural Compute Stick to infer a label and get back the result: res = self.exec_net.infer(inputs={self.input_blob: images}). Server: at this point we have an image and a vector res with information about what the webcam is looking at. Note the frame = model.process_frame(frame), which sends the frame to the Compute Stick and gets it back with the labels.
How to go from raw data to production like a pro
       
Putting it all together: from raw data to production. In real-world scenarios, it is very common to bump into highly imbalanced datasets (for example, the Credit Card Fraud dataset from Kaggle). The use of tools such as synthetic data generators allows you to quickly learn and replicate the patterns of the underrepresented classes. YData's open-source repository allows you to explore and experiment with different synthetic data generation methods. When using the ydata-synthetic library, you can generate synthetic data, for example with a WGAN-GP, in a few lines of code. By combining the new synthetic frauds with the real data, you can easily improve the quality and the subsequent accuracy of your classifier.
Implementing Explainability for Gradient Boosting Trees
       
For example, decision trees. Decision trees perform a set of hierarchical decisions to reach a solution. In a decision tree, the prediction is the value stored at leaf(x), where leaf(x) is the leaf node of the tree into which the sample x falls. GB (as well as RF) is an ensemble of decision tree models: ensemble means that GB is a set of decision trees that perform a prediction working together. You may wonder why explainability concerns us this much.
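Written out, a reconstruction consistent with the description above (eta is the learning rate, K the number of trees):

single tree:  prediction(x) = value(leaf(x))
GB ensemble:  F(x) = F_0(x) + eta * (tree_1(x) + tree_2(x) + ... + tree_K(x))

Because the ensemble prediction is a plain sum, each tree's contribution to a given prediction can be reported separately, which is what tree-level explainability exploits.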
Visualizing NASA’s Meteorite Dataset
       
Eventually I settled on the 'Meteorite Landings' dataset from NASA. For example, this dataset has 'reclat' and 'reclong' (latitude and longitude), which contain the same information as 'GeoLocation' but formatted differently. Some labels exist in one dataset but not in the other, so when the datasets were combined, the missing entries became 'NaN'. Data cleaning covers everything from making sure the data is in the right format to be read by your model, to not repeating yourself, to dealing with missing data.
What is machine learning?
       
It seems most people derive their definition of machine learning from a 1959 quote from Arthur Lee Samuel: "Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort." The interpretation to take away from this is that machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. Machine learning draws a lot of its methods from statistics, but there is a distinctive difference between the two areas: statistics is mainly concerned with estimation, whereas machine learning is mainly concerned with prediction. Categories of machine learning: there are many different machine learning methods that solve different tasks, and putting them all in rigid categories can be quite a task on its own. Supervised learning refers to a subset of machine learning tasks where we're given a labeled dataset of N input-output pairs, and our goal is to come up with a function h from the inputs to the outputs that predicts those labels. Unsupervised learning covers another subset of machine learning tasks, where we're only given a dataset of N input variables.
Getting Started with SageMaker for Model Training
       
Create a SageMaker Estimator script. We need a Python script that uploads our training data to S3, defines our SageMaker Estimator, and launches the training job. You can make your model script really flexible to accept different parameters for data preprocessing and model training; in this context, a hyperparameter can be any argument you'd like to pass to your model script for data pre-processing as well as model training. Here's the full script to kick off model training, which we'll run after we create the model script.
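A minimal sketch of such an Estimator definition with the SageMaker Python SDK, assuming an existing execution role and an S3 path for training data (both are placeholders):

from sagemaker.sklearn import SKLearn

role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder role ARN

estimator = SKLearn(
    entry_point="train.py",            # your model script
    role=role,
    instance_type="ml.m5.large",
    instance_count=1,
    framework_version="0.23-1",
    hyperparameters={"max_depth": 5},  # forwarded to train.py as CLI arguments
)

estimator.fit({"train": "s3://my-bucket/train/"})  # placeholder S3 URI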
Box-Cox transformation is the magic we need
       
Box-Cox transformation is the magic we need. When dealing with statistical analysis, we usually want to have normally distributed data. One of the most popular methodologies to achieve that is the Box-Cox transformation. What is data transformation? After a linear transformation, the shape of the data distribution does not change, so our data won't look any more normal than before. The Box-Cox transformation, by contrast, is a non-linear transformation that allows us to choose between the linear and log-linear models.
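The transform itself, for positive y, is y(λ) = (y^λ − 1) / λ when λ ≠ 0, and log(y) when λ = 0. SciPy can both apply it and estimate λ by maximum likelihood:

import numpy as np
from scipy import stats

data = np.random.default_rng(0).lognormal(size=500)  # skewed, positive data

transformed, lam = stats.boxcox(data)  # lam is the ML estimate of lambda
print(lam)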
fit() vs predict() vs fit_predict() in Python scikit-learn
       
Then we instantiate an SVC classifier and finally call fit() to train the model using the input training data and labels. fit(X, y, sample_weight=None): fit the SVM model according to the given training data, where X contains the training vectors (n_samples samples with n_features features each) and y the target values. For the SVC classifier in particular, you can find the available fitted parameters in this section of the documentation. Essentially, fit_predict() fits the model and performs predictions over the same training data, and is thus more appropriate when performing operations such as clustering.
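A minimal sketch contrasting the methods on toy data:

from sklearn.svm import SVC
from sklearn.cluster import KMeans

X = [[0, 0], [1, 1], [9, 9], [10, 10]]
y = [0, 0, 1, 1]

clf = SVC().fit(X, y)          # supervised: learn from X and y
print(clf.predict([[8, 8]]))   # predict labels for unseen data

km = KMeans(n_clusters=2, random_state=0)
print(km.fit_predict(X))       # unsupervised: fit and label X in one call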
Human Priors in Object Detection
       
Human Priors in Object Detection. Of all the problems I have worked on in computer vision, the most challenging one is object detection. To show what I mean by injecting some strong priors, let's consider the most widely known human prior in object detection: the anchors. The anchor injection in object detection proposes the following process: if the center of the target object to predict falls within a given cell of the grid, that cell is responsible for predicting that object. Anchor-less priors: so are there any other alternatives for object detection priors?
Deploying a REST API to Serve a Face Profile Model
       
We zip up the deployment_package, then register the inputs and outputs of your deployment request using the UbiOps API. The request contains the following keys: deployment (str), the name of the deployment; version (str), the name of the version; input_type (str), the deployment input type, either 'structured' or 'plain'; output_type (str), the deployment output type, either 'structured' or 'plain'; language (str), the programming language the deployment is running; and environment_variables (str), the custom environment variables configured for the deployment. See the bottom of the .gitignore for more details. Deploy: you can deploy the deployment_package.zip using the UbiOps API. Configure the UbiOps API connection: this step requires you to manually obtain an API key in the UbiOps UI and create a new project. It also shows how to utilize some of the exceptions generated by the UbiOps API to do things like basic, but automatic, upversioning.
How Can You Use AI to Prevent Cyberbullying?
       
How Can You Use AI to Prevent Cyberbullying? Three years ago, the Toxic Comment Classification Challenge was published on Kaggle. Since BERT's goal is to generate a language model, only the encoder mechanism is necessary. To learn more about BERT, read BERT Explained: State of the art language model for NLP by Rani Horev. Let's load the BERT model and tokenizer with the bert-base-uncased pre-trained weights.
Training DETR on Your Own Dataset
       
All object detection algorithms have their pros and cons; R-CNN (and its derivatives) is a 2-step algorithm, requiring both a region proposal computation and a detection computation. In contrast, DETR is a direct set solution to the object detection problem. DETR usually requires a very intensive training schedule. We are interested in fine-tuning a pretrained DETR model on a personal dataset, potentially with a different number of classes than COCO. The original COCO dataset has over 200,000 annotated images with over 1.5 million instances split over 90 classes, so if your dataset is comparable it may improve your training schedule to set pretrained = False in the notebook.
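A minimal sketch of loading the pretrained DETR model via Torch Hub before fine-tuning (the notebook's handling of a different class count is omitted here; repo and entry-point names are per the official facebookresearch/detr hub config):

import torch

# Load DETR with a ResNet-50 backbone from the official repo's Torch Hub entry
model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)
model.eval()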
How to detect online trends without web scraping
       
Web scraping has become the default way to extract data from websites; it is used, among many other things, for price monitoring, watching competitors, and detecting website updates. With the popularity of scraping came a lot of libraries and other tools to make it easier: you can easily extract all the text data from the source code of a website. In this article, I will show you how to extract data from websites using methods that are successfully used in processing scanned documents.
(Deep) House: Making AI-Generated House Music
       
Generative audio approaches: historically, generative audio models were trained on datasets of symbolic musical representations. However, because they learn distributions of parameters rather than actual values, the generated samples can be imprecise: the audio might sound reasonable on a small timescale, but it doesn't sound quite as good over a longer timeframe. The top-level prior is trained on ~50 hours of deep house music with the vocal track removed (c/o spleeter) (figure from Dhariwal, 2020). The level-2 generated audio samples contain the long-range structure and semantics of the music but are of low audio quality due to the compression.
An Introduction to Linear Regression
       
Rather, I was pondering the fabulous rollercoaster of a ride of $GME as a timely example of something to apply linear regression to; one can apply a linear regression model to a given stock to predict its price in the future. Linear regression just means that you are going to do something using a linear combination of parameters. Linear regression is most commonly used to describe a straight line, but regression also covers logistic and polynomial variants; I doubt those can be called linear regression anymore, but they are certainly regressions. Evaluation of regression models: now that we have built a regression model, we can quantitatively evaluate its performance.
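A minimal sketch of fitting a line to a price series with scikit-learn (the data here is synthetic, not $GME):

import numpy as np
from sklearn.linear_model import LinearRegression

days = np.arange(30).reshape(-1, 1)  # time as the single feature
prices = 3.0 * days.ravel() + np.random.default_rng(0).normal(0, 5, 30)

model = LinearRegression().fit(days, prices)
print(model.coef_, model.intercept_)  # slope and intercept
print(model.predict([[35]]))          # extrapolated price at day 35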
Tuning Hyperparameters with Optuna
       
Hyperparameters and scikit-learn tuning methods. In a machine learning project, tuning the hyperparameters is one of the most critical steps. Since the ML model cannot learn the hyperparameters from the data, it is our responsibility to tune them. Random search (RandomizedSearchCV and HalvingRandomSearchCV): these methods also require us to set up a grid of hyperparameters. Some specialized estimators can fit data for a range of a hyperparameter's values almost as efficiently as fitting the estimator for a single value, which is what makes their built-in searches possible. Since some models have quite a few hyperparameters, the grid search methods of scikit-learn will get very expensive pretty fast.
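A minimal Optuna sketch tuning a single hyperparameter; the quadratic objective is a stand-in for a real cross-validation score:

import optuna

def objective(trial):
    # Suggest a value for C on a log scale
    c = trial.suggest_float("C", 1e-3, 1e3, log=True)
    return (c - 1.0) ** 2  # pretend the best C is 1.0

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)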
Fixed Effect Regression — Simply Explained
       
With the spirit of learning by explaining, I decided to write a blog to explain the fixed effect regression model and its implementation in Python. This blog incorporates three parts, the first being: what is the fixed-effect model, and why do we want to use it? A causal effect is the difference in outcome between doing a certain thing and not doing it: P(Y | I did X) versus P(Y | had I not done X). You may also ask why you can't use the estimator/coefficient of your variable of interest directly after training the model: there are variables that are both correlated with our variable of interest (more likely to see the new feature) and correlated with our outcome variable (spend more).
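A minimal sketch of a fixed-effect regression in Python with the linearmodels package (not necessarily the package the post uses), assuming a DataFrame indexed by (entity, time); the data is hypothetical:

import numpy as np
import pandas as pd
from linearmodels.panel import PanelOLS

# Toy panel: 3 users observed over 4 periods
idx = pd.MultiIndex.from_product([["u1", "u2", "u3"], range(4)],
                                 names=["user", "period"])
df = pd.DataFrame({"treated": np.random.default_rng(0).integers(0, 2, 12),
                   "spend": np.random.default_rng(1).normal(100, 10, 12)},
                  index=idx)

# entity_effects=True absorbs time-invariant user characteristics
res = PanelOLS(df["spend"], df[["treated"]], entity_effects=True).fit()
print(res.params)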
The Realities of Socially Conscious Machine Learning
       
The Realities of Socially Conscious Machine Learning. When Dr. Latanya Sweeney first arrived at Harvard as a visiting professor, she had already published groundbreaking research on data anonymization and privacy. An algorithm seemingly decided that Dr. Sweeney's first name, Latanya, was more likely given to a person of color (her study shows examples of ads served for names like "Latonya Evans" and "Latisha Smith"; Sweeney, 2013). What happened next is what one expects from a computer scientist. This piece draws on algorithmic bias research, federal guidance, legal precedent, and other research to help navigate this dilemma. It aims to help practitioners (at all stages) begin to develop an intuition for detecting, understanding, and, whenever possible, correcting harmful behaviors that embed themselves in machine learning applications.
How to change the world with data science
       
How to change the world with data science. Things get done only if the data we gather can inform and inspire those in a position to make a difference (Michael J. Schmoker). Today, there are almost a million people working globally in a data science related field, yet almost every application of data science is focused on making already comfortable lives even more comfortable. How to use data science for social good? While such applications generate the most revenue, they certainly aren't the most meaningful ones. The application of data science to combat social issues is called social data science, and it can have a world-changing impact.
How To Answer Any Machine Learning System Design Interview Question
       
How To Answer Any Machine Learning System Design Interview Question. This template will guide you through almost any ML system design question that you can get in an interview. It is important to note that this template is intentionally generic, so that when you encounter a new system design question, it is easy to fill in each section. Below is an overview of the steps you should take in an ML system design interview: when you are answering an ML design question, the two areas to focus on are Data and Modeling. You should write those answers down on the whiteboard (or on the online platform if you are doing the interview virtually). Feature selection: if we are using a deep neural network, then we do not need feature selection. Some models need to be retrained every day, some every week, and others monthly or yearly.
Random Search and Grid Search for Function Optimization
       
This can be achieved using a naive optimization algorithm, such as a random search or a grid search. Tutorial overview: this tutorial is divided into three parts: Naive Function Optimization Algorithms; Random Search for Function Optimization; and Grid Search for Function Optimization. There are many different algorithms you can use for optimization, but how do you know whether the results you get are any good? Random search is also referred to as random optimization or random sampling; it involves generating and evaluating random inputs to the objective function. The best point found can then be marked on a plot of the objective:

pyplot.plot(sample[best_ix][0], sample[best_ix][1], '*', color='white')
# show the plot
pyplot.show()
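A minimal sketch of random search over a 2-D objective, reusing the sample/best_ix naming from the snippet above (the bowl-shaped objective is illustrative):

import numpy as np

def objective(x, y):
    return x ** 2 + y ** 2  # toy objective with minimum at (0, 0)

rng = np.random.default_rng(0)
sample = rng.uniform(-5.0, 5.0, size=(100, 2))  # 100 random candidate points
scores = np.array([objective(x, y) for x, y in sample])

best_ix = int(np.argmin(scores))
print(sample[best_ix], scores[best_ix])  # best input found and its score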
My Experience as a Product Data Analyst
       
I’m currently a product analyst for a mobile app and before my interview I downloaded the app to try. As a product analyst, I can use the app myself to evaluate new features and relate to the user experience. Mobile versus webEven if you’ve been a product analyst before there are additional considerations for analysis when you support a mobile product. Make sure you understand A/B testing concepts before becoming a product analyst because it’s very likely you’ll be involved with evaluating tests. It was not a seamless transition switching to a product data analyst role but it’s definitely a worthwhile experience to try and I hope you’ll find it makes you a better data analyst too.
Extraction of Objects In Images and Videos Using 5 Lines of Code.
       
The goal of computer vision is to make it possible for computers to analyze objects in images and videos and to solve different vision problems. Object segmentation has paved the way for convenient analysis of objects in images and videos, contributing immensely to different fields, such as medicine, vision in self-driving cars, and background editing in images and videos. PixelLib uses segmentation techniques to implement object extraction in images and videos using five lines of code. extract_segmented_objects is the parameter that tells the function to extract the objects segmented in the image; it is set to true. show_bboxes is the parameter that shows segmented objects with bounding boxes.
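A minimal sketch of that five-line PixelLib usage, assuming a local image and separately downloaded Mask R-CNN weights (both file names are placeholders):

from pixellib.instance import instance_segmentation

segment_image = instance_segmentation()
segment_image.load_model("mask_rcnn_coco.h5")  # pretrained weights
segment_image.segmentImage("sample.jpg", extract_segmented_objects=True,
                           save_extracted_objects=True, show_bboxes=True,
                           output_image_name="output.jpg")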
Quick Setup Guide for your AWS Account
       
Quick Setup Guide for your AWS Account. Cloud technologies are now ubiquitous in industry and are used by technologists of all stripes, not just DevOps engineers. Create an admin IAM user: IAM (Identity and Access Management) is the AWS system for managing access and permissions for the various AWS resources. Fill out the Role name field and click Create role. Install and configure the AWS CLI: the AWS CLI (command-line interface) will allow you to programmatically run AWS workloads from your local machine or wherever it is installed. Fill in your AWS Access Key ID and AWS Secret Access Key.
Stop Using The Machine Learning Easy Button
       
Stop Using The Machine Learning Easy Button. With the rise of machine learning MOOCs and cheaper compute power, it's becoming much easier for data enthusiasts to explore the depths of machine learning and the data science toolkit. Sprinkle in the continuous success stories surrounding machine learning in popular news, and data laymen begin to have an appetite for data science. Pair that with analysts now learning ways to create deep learning models via a few lines of Python code, and we quickly forget the fundamentals of good old-fashioned analysis. They know Johnson is a quality analyst with the best intentions; heck, he's even been taking those machine learning courses on Coursera! Make sure the new problem you are handed can truly benefit from machine learning.
Machine Learning 101: Master ML
       
“Machine learning will automate jobs that most people thought could only be done by people.” — Dave Waters. The field of machine learning is one of the most significant areas of study in the modern world. Machine learning is gaining more popularity each day and is one of the most intriguing emerging trends of the current generation. Artificial intelligence, data science, and machine learning are contributing tremendously to the developments and technologies of the modern era. In this article, our main objective will be to cover most of the essential aspects of machine learning. So, without further ado, let us get started exploring these aspects of machine learning and the multiple concepts that come alongside them.
6 Must-Know Parameters for Machine Learning Algorithms
       
6 Must-Know Parameters for Machine Learning Algorithms. There are several machine learning algorithms for both supervised and unsupervised tasks. Although these algorithms are ready to use, we usually need to tune them. The parameters that we can customize or adjust are known as hyperparameters, and we need a comprehensive understanding of them and their effects. In this article, we will cover 6 critical hyperparameters of commonly used supervised machine learning algorithms.
Gradient Boosted Trees for Classification — One of the Best Machine Learning Algorithms
       
The last part left to do before moving to Tree 2 is recalculating the residuals (model targets). Following the steps in the process map above, this is what we need to do to recalculate the residuals for each observation:

New Log(odds) = Log(odds) from the initial prediction + Value from Tree 1 * Learning Rate

This is exactly why we need to build many trees, as having a larger number of trees will lead to improvements in predictions across all observations. After Tree 2:

New Log(odds) = Log(odds) from the initial prediction + Value from Tree 1 * Learning Rate + Value from Tree 2 * Learning Rate

(Figure: Gradient Boosted Trees — model results.)
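A worked toy example of that update (all numbers illustrative): with an initial Log(odds) of 0.7, a Tree 1 leaf value of 1.4, and a learning rate of 0.1,

New Log(odds) = 0.7 + 1.4 * 0.1 = 0.84

which corresponds to a predicted probability of e^0.84 / (1 + e^0.84) ≈ 0.70.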
A Journey through XGBoost: Milestone 2
       
Topics we discuss: form a classification problem; identify the feature matrix and the target vector; build the XGBoost model (scikit-learn compatible API); describe the 'accuracy' and 'area under the ROC curve' metrics; explain XGBoost classifier hyperparameters; build the XGBoost model (non-scikit-learn compatible API); XGBoost's DMatrix; create a small web app for our XGBoost model with the Shapash Python library; and make some fancy visualizations. Prerequisites: before proceeding, make sure that you've read the first article of the XGBoost series (A Journey through XGBoost: Milestone 1 — Setting up the background). After that, the XGBoost model (with user-defined parameters) will learn the rules based on X and y. The easiest way to build an XGBoost model is to use its scikit-learn compatible API. Note: whenever possible, I recommend you use the XGBoost scikit-learn compatible API. In the web app, you can even download individual plots and save them on your local machine.
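A minimal sketch of both APIs on toy data:

import numpy as np
import xgboost as xgb
from xgboost import XGBClassifier

X = np.random.default_rng(0).normal(size=(100, 4))
y = (X[:, 0] > 0).astype(int)

# Scikit-learn compatible API
clf = XGBClassifier(n_estimators=50, max_depth=3)
clf.fit(X, y)

# Native API via DMatrix
dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"objective": "binary:logistic", "max_depth": 3},
                    dtrain, num_boost_round=50)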
A Comprehensive Introduction to Bayesian Deep Learning
       
A Comprehensive Introduction to Bayesian Deep Learning. Table of contents: 1. …; 5. Bayesian Deep Learning; 5.1 Recent Approaches to Bayesian Deep Learning; 6. Back to the Paper; 6.1 Deep Ensembles are BMA; 6.2 Combining Deep Ensembles With Bayesian Neural Networks; 6.3 Neural Network Priors; 6.4 Rethinking Generalization and Double Descent; 7. … Instead of starting with the basics, I will start with an incredible NeurIPS 2020 paper on Bayesian deep learning and generalization by Andrew Wilson and Pavel Izmailov (NYU), called Bayesian Deep Learning and a Probabilistic Perspective of Generalization. A Bayesian Neural Network (BNN) is simply posterior inference applied to a neural network architecture.
Predict Customer Churn with Neural Network
       
In this blog post, I decided to start from the opposite side by applying a multilayer perceptron model (a neural network) to predict customer churn. Reasons for customer churn vary, and the most typical are poor customer service, not finding enough value in a product or service, lack of customer loyalty, and lack of communication. Why is it important to predict customer churn? Predicting customer churn with an MLP: let us use the MLP model to predict customer churn. Thanks for reading, and please do comment below with your ideas on customer churn with machine learning / deep learning.
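A minimal sketch with scikit-learn's MLPClassifier; the generated data below stands in for a real churn table:

from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Stand-in for a churn dataset: 1000 customers, 10 features, binary churn label
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

mlp = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
mlp.fit(X_train, y_train)
print(mlp.score(X_test, y_test))  # accuracy on held-out customers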
6 Machine Learning Certificates to Pursue in 2021
       
This article will cover the top 6 machine learning certificates that you can pursue this year to elevate your portfolio and your chances of landing your dream job. №2: Certificate in Machine Learning by Stanford. The machine learning course and certificate offered by Stanford University is perhaps the better option for those who want to get into machine learning and earn a certificate at the same time. №4: Machine Learning Certificate by Harvard. The final course-based certificate on this list is the machine learning certificate offered by Harvard University on edX. The AWS Certified Machine Learning specialty is a certificate designed to measure one's ability to design, develop, and deploy machine learning models using the AWS cloud. Similar to the Google machine learning engineer certificate, it doesn't require a specific course completion to obtain.
4 Improvements For Your Data Science Resume
       
With that being said, let us dive deeper into these ways that you can improve your data science resume. One improvement is a mission statement: a few sentences highlighting your mission in data science. Here are all of the improvements once more, summarized: highlight projects; style of writing; style and format of resume; mission statement. Also important to note is that all of these resume improvements can be applied to non-data science roles as well. Please feel free to comment below if you have leveraged any of these improvements for your resume, and which ones. Have they helped you in your data science career?
Adaline neural networks: the origins of gradient descent
       
However, this update variable is calculated in a different way, using an algorithm known as gradient descent. The challenge that gradient descent tackles is how to get from the current point to the weight value that produces the lowest error. The way gradient descent finds its way to the minimum is (using the right mathematical jargon) by computing the partial derivative with respect to the weights. Let's contextualise gradient descent for adaline training. This is taken from my gradient descent notebook.
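A minimal sketch of adaline's batch gradient descent update on toy data (learning rate and data are illustrative):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([2.0, -1.0]) + 0.5  # linear target the model can learn

w = np.zeros(2)
b = 0.0
eta = 0.01  # learning rate

for _ in range(100):
    output = X @ w + b        # adaline's linear activation
    errors = y - output
    w += eta * X.T @ errors   # gradient step for the weights
    b += eta * errors.sum()   # gradient step for the bias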
Lookahead Decision Tree Algorithms
       
Suppose we build a 2-level decision tree using scikit-learn's default parameters. Note that the decision tree generates 4 final leaves/buckets which a candidate can fall into. The goal of the decision tree is to discriminate potentially successful candidates from unsuccessful ones. Therefore, the decision tree would prefer to split on the local optimum, "Academic Qualifications", rather than doing a deeper search. But if the decision tree were able to look ahead one step, it would find a more efficient split based on "Candidate Seniority" in the first step and "Role Seniority" in the second step, or the other way around.
How to Create Machine Learning UIs the Easy Way
       
How to Create Machine Learning UIs the Easy Way. Introduction: data scientists excel at creating models representing and predicting real-world data, but actually putting machine learning models to use is more of an art than a science. A machine learning model's goal is to solve a problem, and it can only do that when consumed through a UI.

gr.Interface(fn=predict_shape, inputs=input, outputs=output, live=True, title=title, description=description, capture_session=True).launch(debug=True)

Awesome! Deployment may take several minutes; the deployed interface can then be found at the generated link (https://gradio.app/g/salma71/sketch_predict). You may receive a confirmation email that the interface was successfully deployed, and you can check the deployed interface there.
Portfolio Diversification With Emerging Market Bonds
       
The first is the expected returns of the financial assets being considered for the portfolio, typically the geometric mean expected return. The global credit expected return is a 50–50 blend of BI’s expected returns for “US credit” and “EU corporate bonds” respectively. Portfolio returns, however, should be calculated from simple percentage returns of the component assets in the portfolio. Remember that the entire point is to analyse if EM bonds can provide a diversification benefit to a global portfolio. The second is to understand if these two expected volatility distributions are distinct from each other.
Google AI Blog: PAIRED: A New Multi-agent Approach for Adversarial Environment Generation
       
An approach to address this is to automatically create more diverse training environments by randomizing all the parameters of the simulator, a process called domain randomization (DR). A minimax adversary can be trained to minimize the performance of the first RL agent by finding and exploiting weaknesses in its policy, but using a purely adversarial objective is not well suited to generating training environments either. Once the protagonist learns to solve each environment, the adversary must move on to finding a slightly harder environment that the protagonist can't solve. Unlike minimax or domain randomization, the PAIRED adversary creates a curriculum of increasingly longer, but possible, mazes, enabling PAIRED agents to learn more complex behavior.
Multimodal deep learning approach for event detection in sports using Amazon SageMaker
       
With machine learning (ML) techniques, we introduce a scalable multimodal solution for event detection on sports video data. Recent developments in deep learning show that event detection algorithms are performing well on sports data [1]; however, they’re dependent upon the quality and amount of data used in model development. This post explains a deep learning-based approach developed by the Amazon Machine Learning Solutions Lab for sports event detection using Amazon SageMaker. If the badminton class is found among two labels associated with a 1-second sample, vote for the audio model (get the label and probability from the audio model). ConclusionThis post outlined a multimodal event detection approach using a combination of RGB, optical flow, and audio models through robust ResNet50 and MobileNet architectures implemented on SageMaker.
How Does Your Smartwatch Know You’re Standing?
       
How Does Your Smartwatch Know You're Standing? The dataset contains the following attributes: triaxial acceleration from the accelerometer (total acceleration) and the estimated body acceleration. Some features are distributed in Gaussian form and some are closer to Pareto distributions; this seems fine for a prediction model. (Figures in the post report precision and recall for the position prediction model.)
How to Cluster Data!
       
When working on data science projects, data scientists often encounter datasets that have unlabeled data points. In order to make sense out of unlabeled data, unsupervised machine learning techniques are deployed to label the data points and to provide a clear contrast between the different classes. The algorithm operates by examining the entire dataset to find similarities among the relationships of the data points' variables; here, it was fitted on the Goodreads book dataset in order to attribute labels to the data points. Each data point's distance is measured from the cluster centers, and the data point is labelled based on its nearest cluster center.
Using Distributed Computing for Neuroimaging
       
Nowadays, due to increases in resolution, magnetic field strength, consortia, and infrastructure, neuroimaging data are also big data. From the point of view of the data, this translates to a 4D volume (3D for a brain scan, then temporally following the blood-flow evolution). Similar scenarios occur with diffusion data, where instead of time points we have different gradient directions. These tools allow you to read and load imaging data, convert it to Resilient Distributed Datasets where it can be manipulated in parallel, and convert it back into an imaging format such as NIFTI. It has been reported that the computational time using (Py)Spark can be reduced to a quarter compared to traditional approaches.
Unconventional Sentiment Analysis: BERT vs. Catboost
       
Unconventional Sentiment Analysis: BERT vs. Catboost. Introduction: sentiment analysis is a Natural Language Processing (NLP) technique used to determine if data is positive, negative, or neutral. For conclusions and assessments of the proposed method, I need a baseline model.

!pip install tensorflow_hub
!pip install tensorflow_text

small_bert/bert_en_uncased_L-4_H-512_A-8 is a smaller BERT model, one of the smaller BERT models referenced in Well-Read Students Learn Better: On the Importance of Pre-training Compact Models. The smaller BERT models are intended for environments with restricted computational resources and can be fine-tuned in the same manner as the original BERT models.
The Walrus Operator in Python
       
Enter the walrus operator. Introduced in Python 3.8, the walrus operator (:=), formally known as the assignment expression operator, offers a way to assign to variables within an expression, including variables that do not exist yet. As seen above, with the simple assignment operator (=), we assigned num = 15 in the context of a stand-alone statement. In other words, the walrus operator allows us to both assign a value to a variable and return that value, all in the same expression. The name is due to its similarity to the eyes and tusks of a walrus on its side. In name := expr, the expression expr is evaluated and then assigned to the variable name.
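A minimal example of assigning and using a value in one expression:

import random

# Without the walrus operator: assign, then test
value = random.random()
if value > 0.5:
    print(value)

# With the walrus operator: assign and test in the same expression
if (value := random.random()) > 0.5:
    print(value)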
What Took Me So Long to Land a Data Scientist Job
       
Without prior job experience, it is hard to demonstrate your skills. I felt like HR professionals did not take my resume into consideration due to my lack of job experience in the field. I got very few technical interviews, and in those I mostly did well. If you do not have prior job experience in the data science field, the best alternative is to complete projects: frame a problem and design an approach that aims to solve it using data.
Arabic Sentence Embeddings with Multi-Task Learning
       
Arabic Sentence Embeddings with Multi-Task LearningPhoto by TOMOKO UJI on UnsplashIn the first article of this Arabic natural language processing (NLP) series, I introduced a transformer language model named AraBERT (Arabic Bidirectional Encoder Representations from Transformers) released by Antoun et al. In an article written last year, I discussed my interest in Transformer sentence embeddings, an idea I encountered in a research paper that details the training of efficient sentence embeddings from transformer language models. Training Arabic sentence embeddings with Multi-task learningAll code for this tutorial is in Python using the Pytorch framework. Next steps include experimenting with other methods for training Arabic sentence embeddings, and assessing how well SAraBERT is able to summarize Arabic text or perform a semantic search. Fast and efficient Arabic sentence embeddings with SAraBERT make it possible to quickly utilize a variety of NLP techniques that rely on semantic search.
HyperDimension Hypothesis Testing — MultiScale Graph Correlation and Distance Correlation
       
Distance Correlation (Dcorr). The theory was first published in the 2007 Annals of Statistics by Gábor J. Székely and others¹. X and Y are double-centered distance matrices. Double-centered matrix: for each element in the distance matrix, subtract the mean of its column and the mean of its row, and add the grand mean of the matrix. The distance matrices for X and Y are:

X = array([[0., 0., 1., 1.],
           [0., 0., 1., 1.],
           [1., 1., 0., 0.],
           [1., 1., 0., 0.]])

Y = array([[0., 1., 0., 1.],
           [1., 0., 1., 0.],
           [0., 1., 0., 1.],
           [1., 0., 1., 0.]])

After double-centering, the matrices become:

X_centered = array([[-0.5, -0.5,  0.5,  0.5],
                    [-0.5, -0.5,  0.5,  0.5],
                    [ 0.5,  0.5, -0.5, -0.5],
                    [ 0.5,  0.5, -0.5, -0.5]])

Y_centered = array([[-0.5,  0.5, -0.5,  0.5],
                    [ 0.5, -0.5,  0.5, -0.5],
                    [-0.5,  0.5, -0.5,  0.5],
                    [ 0.5, -0.5,  0.5, -0.5]])

The only point that satisfies the condition is the middle of the square.
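A minimal numpy sketch of the double-centering step:

import numpy as np

def double_center(D):
    # Subtract column and row means, then add back the grand mean
    return D - D.mean(axis=0) - D.mean(axis=1, keepdims=True) + D.mean()

X = np.array([[0., 0., 1., 1.],
              [0., 0., 1., 1.],
              [1., 1., 0., 0.],
              [1., 1., 0., 0.]])
print(double_center(X))  # matches X_centered above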
How to Update Neural Network Models With More Data
       
How to update trained neural network models with just new data or with combinations of old and new data. Tutorial overview: this tutorial is divided into three parts: Updating Neural Network Models; Retraining Update Strategies (update model on new data only; update model on old and new data); and Ensemble Update Strategies (ensemble model with model on new data only; ensemble model with model on old and new data). Selecting and finalizing a deep learning neural network model for a predictive modeling project is just the beginning. These approaches represent two general themes in updating neural network models in response to new data: retrain update strategies and ensemble update strategies. At the other extreme, a model could be fit on the new data only, discarding the old data and old model.
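A minimal Keras sketch of the update-on-new-data-only strategy; the model and the arrays X_new, y_new below are stand-ins for an already-trained model and freshly collected data:

import numpy as np
from tensorflow import keras

# Stand-ins for an existing trained model and newly collected data
model = keras.Sequential([keras.layers.Input(shape=(5,)), keras.layers.Dense(1)])
model.compile(optimizer=keras.optimizers.SGD(learning_rate=1e-4), loss="mse")
X_new = np.random.rand(100, 5)
y_new = np.random.rand(100, 1)

# Continue training on the new data only, with a small learning rate
model.fit(X_new, y_new, epochs=5, verbose=0)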
Multimodal Neurons in Artificial Neural Networks
       
discovered that the human brain possesses multimodal neurons. Now, we’re releasing our discovery of the presence of multimodal neurons in CLIP. Our discovery of multimodal neurons in CLIP gives us a clue as to what may be a common mechanism of both synthetic and natural vision systems—abstraction. Indeed, these neurons appear to be extreme examples of “multi-faceted neurons,” neurons that respond to multiple distinct cases, only at a higher level of abstraction. How multimodal neurons composeThese multimodal neurons can give us insight into understanding how CLIP performs classification.
Self-supervised learning: The dark matter of intelligence
       
Today, we’re sharing details on why self-supervised learning may be helpful in unlocking the dark matter of intelligence — and the next frontier of AI. Self-supervised learning is predictive learningSelf-supervised learning obtains supervisory signals from the data itself, often leveraging the underlying structure in the data. As a result of the supervisory signals that inform self-supervised learning, the term “self-supervised learning” is more accepted than the previously used term “unsupervised learning.” Unsupervised learning is an ill-defined and misleading term that suggests that the learning uses no supervision at all. In fact, self-supervised learning is not unsupervised, as it uses far more feedback signals than standard supervised and reinforcement learning methods do. Using self-supervised learning at FacebookAt Facebook, we’re not just advancing self-supervised learning techniques across many domains through fundamental, open scientific research, but we’re also applying this leading-edge work in production to quickly improve the accuracy of content understanding systems in our products that keep people safe on our platforms.
SEER: The start of a more powerful, flexible, and accessible era for computer vision
       
Facebook AI has now brought this self-supervised learning paradigm shift to computer vision. We’ve developed SEER (SElf-supERvised), a new billion-parameter self-supervised computer vision model that can learn from any random group of images on the internet — without the need for careful curation and labeling that goes into most computer vision training today. SEER’s performance demonstrates that self-supervised learning can excel at computer vision tasks in real-world settings. This is a major breakthrough that ultimately clears the path for more flexible, accurate, and adaptable computer vision models in the future. Self-supervised learning has incredible ramifications for the future of computer vision, just as it does in other research fields.
What is Algorithm Fairness?
       
In this article, we will explore the concept of model bias and how it relates to the field of algorithm fairness. What is algorithm fairness? Algorithm fairness is the field of research aimed at understanding and correcting biases like these. “Algorithm fairness/bias” is actually a bit of a misleading term. In the future, I hope to write more about algorithm fairness.
An Introduction to Neural Style Transfer for Data Scientists
       
An Introduction to Neural Style Transfer for Data Scientists. Photo by h heyerlein on Unsplash. If 2Pac was only allowed to release music under the pretence that his style was to match the Queen’s English, the world would have been a significantly worse place. Both sets of researchers are taking advantage of recent developments in the field of style transfer. Neural style transfer was initially used between images, whereby a certain composition of one image could be projected onto something similar. However, this technique has recently been adapted for the use case of text style transfer. This is called an ‘encoder-decoder architecture’, and in this manner Neural Machine Translation (NMT) can solve local translation problems.
8 Common Pitfalls In Neural Network Training & Workarounds For Them
       
When the learning rate is large, yet not large enough to cause instability, you get large weight updates: 1] instability due to a large learning rate; 2] oscillation due to a large learning rate. Solution: As discussed above, having a large learning rate results in instability or oscillations, so the first solution is to tune the learning rate by gradually decreasing it. While training a neural network, we need a small learning rate in directions of high curvature, otherwise the gradient may overshoot; similarly, in directions of low curvature, we need a large learning rate so that we reach the optimum quickly.
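A small sketch of the "gradually decrease it" advice in Keras (the values are illustrative, not from the article):

import tensorflow as tf

# Decay the learning rate exponentially: start relatively large for fast
# initial progress, then shrink it to avoid instability and oscillation.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1, decay_steps=1000, decay_rate=0.9)
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule)

Adaptive optimizers such as Adam approximate the curvature-dependent behaviour described above by scaling the step size per parameter.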
Testing Best Practices for Machine Learning Libraries
       
Testing Best Practices for Machine Learning Libraries. Photo by Kevin Ku from Unsplash. Disclaimer: you won’t be able to fit everything to the proposed structure; apply common sense and your judgement during test development and design. It defaults to “function”, so the fixture (a Spark instance in our case) will be created for each test function. The most direct way is to call $ pytest tests/ , or $ pytest tests/test_language_detection.py tests/test_something.py , or $ pytest tests/test_language_detection.py::test_detect_language tests/test_something.py . For more info on specifying and selecting tests, please refer to the official documentation. Common Testing Strategies: In this section, we will discuss common testing strategies we developed while testing our internal libraries. In one run, the test function will be called three times, so that we get a result for each test case separately.
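A minimal sketch of both ideas (the detect_language stub stands in for the library under test, and the fixture body is illustrative, not the article's code): a session-scoped fixture, and a parametrized test that is called three times in one run.

import pytest

def detect_language(text):
    # Hypothetical stand-in for the library function under test.
    return {'hello world': 'en', 'bonjour le monde': 'fr', 'hola mundo': 'es'}[text]

@pytest.mark.parametrize('text,expected', [
    ('hello world', 'en'),
    ('bonjour le monde', 'fr'),
    ('hola mundo', 'es'),
])
def test_detect_language(text, expected):
    # One test, three cases: pytest reports a separate result for each.
    assert detect_language(text) == expected

@pytest.fixture(scope='session')
def spark():
    # scope='session' creates the Spark instance once for the whole run,
    # instead of once per test function (the 'function' default).
    from pyspark.sql import SparkSession
    return SparkSession.builder.master('local[1]').getOrCreate()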
Relevance Ranking Simplified
       
Cosine Similarity: Cosine similarity is a metric that measures the similarity between two vectors. Mathematically speaking, it measures the cosine of the angle between the two vectors in a multi-dimensional space. It is thus a measure of direction and not magnitude. Cosine similarity is particularly useful in positive space, where the result is bounded in the range 0 to 1. Cosine similarity is given by cos(θ) = (A · B) / (‖A‖ ‖B‖), and positive space means that every component of the vectors is non-negative. In the case of text relevance ranking, cosine similarity is bounded between 0 and 1, since the TF-IDF matrix doesn’t contain any negative values.
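A tiny NumPy sketch (the vectors are made-up TF-IDF-style values):

import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Non-negative TF-IDF-style vectors, so the score lands in [0, 1].
doc = np.array([0.0, 1.2, 0.4, 0.0])
query = np.array([0.5, 0.9, 0.0, 0.0])
print(cosine_similarity(doc, query))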
Sensitivity and Specificity, explained! — with Zombies?!
       
The specificity now allows us to compute the number of true negatives, which is simply the number of negatives in the reference. False negatives are associated with sensitivity: a low sensitivity means that we are not very sensitive to zombies and miss some of them. False positives are the opposite problem: we think people are zombies who aren’t actually zombies. So almost half of the detected zombies aren’t actually zombies. In particular, you should think about varying the sensitivity and the specificity, and about how this would change the number of false positives and false negatives.
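In code, both metrics come straight from confusion-matrix counts (the numbers below are invented for the zombie example):

def sensitivity_specificity(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)  # share of real zombies we catch
    specificity = tn / (tn + fp)  # share of humans we correctly clear
    return sensitivity, specificity

# 80 zombies caught, 20 missed; 90 humans cleared, 10 wrongly flagged.
print(sensitivity_specificity(tp=80, fn=20, tn=90, fp=10))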
Land Cover Classification of Satellite Imagery using Python
       
Here is the brief information (Sentinel-2 satellite image, source: ESA): the Sentinel-2 mission consists of two satellites developed to support vegetation, land cover, and environmental monitoring. In this article, we are going to use a part of the Sundarbans satellite data, which was acquired by the Sentinel-2 satellite on 27 January 2020. Let’s start coding. Read Data: let’s read the 12 bands using rasterio and stack them into an n-dimensional array using the numpy.stack() method. The ground truth of the satellite image is read using the loadmat method from the scipy.io package. The figure below shows the composite image and the ground truth of the Sundarbans satellite data.
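A sketch of that read-and-stack step (the file pattern and the .mat key are hypothetical; adjust them to your copy of the data):

import numpy as np
import rasterio
from glob import glob
from scipy.io import loadmat

band_paths = sorted(glob('sundarbans_data/*.tiff'))  # one GeoTIFF per band
bands = [rasterio.open(path).read(1) for path in band_paths]
image = np.stack(bands)  # shape: (12, height, width)
ground_truth = loadmat('sundarbans_gt.mat')['gt']  # 'gt' is a hypothetical key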
What Makes College Worth It?
       
What Makes College Worth It? Introduction: Like most Asian immigrants, my parents emphasized the importance of school, with college graduation as the gateway to the American dream. According to a survey conducted by LendEDU, college students expected a median salary of $60,000 after graduating. 52% of college students took out student loans to attend college, and of that number, 17% had trouble paying for it. This made me question: what makes college worth attending?
ElegantRL: A Lightweight and Stable Deep Reinforcement Learning Library
       
This article by Xiao-Yang Liu, Steven Li and Yiyan Zeng describes the ElegantRL library (Twitter and Github). Starting from the success of AlphaGo, various DRL algorithms and applications have been emerging in a disruptive manner. The ElegantRL library enables researchers and practitioners to pipeline the disruptive “design, development and deployment” of DRL technology. Stable: more stable than Stable Baselines 3. ElegantRL implements DRL algorithms under the Actor-Critic framework, where an Agent (a.k.a. a DRL algorithm) consists of an Actor network and a Critic network.
Better reliability of deep learning classification models by exploiting hierarchy of labels
       
Better reliability of deep learning classification models by exploiting a hierarchy of labels. Working with predictive deep learning models in practice often comes with the question of how far we can trust the predictions of the model. The goal is to build a second model, trained on the parent labels in the hierarchy, and this post will show why this increases the reliability of models and at what cost. A flat label structure: most research on new model architectures, and on better models in general, is based on flat labels, discarding the hierarchy in the datasets. One assumption is that a plausibility model benefits from the larger amount of training data per class and generalizes more, so overfitting is prevented. Conclusion: using a plausibility model, where hierarchy information is available in the data, can make predictions more reliable.
Exploit Your Hyperparameters: Batch Size and Learning Rate as Regularization
       
Reducing your learning rate guarantees you get deeper into one of those low points, but it will not stop you from dropping into a random sub-optimal hole. And it likely overfits to your training data, meaning it will not generalize to the real world. An intuitive way to think about this is that a narrow hole is likely specific to your training data. There are many regularization techniques like dropout, dataset augmentation, and distillation. Our tools that are always present, learning rate and batch size, can perform a degree of regularization for us.
Deep Q Network: Combining Deep & Reinforcement Learning
       
Deep Q Network: Combining Deep & Reinforcement LearningReinforcement Learning (RL) is one of the most exciting research areas of Data Science. In this context, the combination of the Reinforcement Learning approach and Deep Learning models, generally referred to as Deep RL, has proven to be powerful. What does Deep Learning bring to Reinforcement Learning? Fortunately, by combining the Q-Learning approach with Deep Learning models, Deep RL overcomes this issue. Deep Q-Learning pipeline, inspired by this articleThe graph below compares Q-Learning and Deep Q-Learning approaches:Q-Learning vs.
Integrating Amazon Polly with legacy IVR systems by converting output to WAV format
       
Amazon Polly, an AI-generated text-to-speech service, enables you to automate and scale your interactive voice solutions, helping to improve productivity and reduce costs. This post shows you how to convert Amazon Polly output to a common audio format like WAV. Converting Amazon Polly file output to WAV: one of the challenges with legacy systems is that they may not support Amazon Polly file outputs like MP3. The output of the Amazon Polly SynthesizeSpeech API call doesn’t support WAV, but some legacy IVRs require the audio output in WAV file format, which isn’t supported natively in Amazon Polly. The following sample code helps in such situations, where the required audio is a WAV format not supported natively by Amazon Polly.
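A sketch of the conversion idea (not the post's exact code): request raw PCM from Polly, then wrap it in a WAV container with Python's standard wave module.

import boto3
import wave

polly = boto3.client('polly')

# Polly can return raw PCM samples, but not WAV directly.
response = polly.synthesize_speech(
    Text='Hello from Amazon Polly',
    OutputFormat='pcm',
    SampleRate='16000',
    VoiceId='Joanna')
pcm_audio = response['AudioStream'].read()

# Wrap the 16-bit mono PCM samples in a WAV container.
with wave.open('output.wav', 'wb') as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)      # 16-bit samples = 2 bytes
    wav_file.setframerate(16000)
    wav_file.writeframes(pcm_audio)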
Utilizing XGBoost training reports to improve your models
       
This task is made easier with the newly launched XGBoost training report feature. For more information, see Debugger XGBoost Training Report Walkthrough. ConclusionIn this post, we generated an XGBoost training report and profiler report using SageMaker Debugger. We then walked through the XGBoost training report and identified a number of issues that we can alleviate with some hyperparameter tuning. For more about SageMaker Debugger, see SageMaker Debugger XGBoost Training Report and SageMaker Debugger Profiling Report.
The Ultimate Performance Metric in NLP
       
For ROUGE-1 we would be measuring the match-rate of unigrams between our model output and reference. Once we have decided which N to use — we now decide on whether we’d like to calculate the ROUGE recall, precision, or F1 score. ROUGE-LROUGE-L measures the longest common subsequence (LCS) between our model output and reference. So, if we took the bigram “the fox”, our original ROUGE-2 metric would only match this if this exact sequence was found in the model output. Nonetheless, it’s a good metric for assessing both machine translation and automatic summarization tasks and is very popular for both.
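A toy ROUGE-N implementation makes the recall/precision/F1 choice concrete (simplified whitespace tokenization; real ROUGE implementations add stemming and more):

from collections import Counter

def rouge_n(candidate, reference, n=1):
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate.split(), n), ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())          # matched n-grams
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-9)
    return {'recall': recall, 'precision': precision, 'f1': f1}

print(rouge_n('the quick brown fox', 'the fast brown fox', n=1))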
Machine Learning Goes Quantum: A Glance at an Exciting Paradigm Shift
       
Machine Learning Goes Quantum: A Glance at an Exciting Paradigm Shift. Quantum computing is a buzz-word that’s been thrown around quite a bit. As a very new field, quantum computing presents a complete paradigm shift from the traditional model of classical computing. Although quantum computing still has a long way to go, machine learning is an especially promising potential avenue. Quantum machine learning is catching on. This article will introduce quantum variations of three machine learning methods and algorithms: transfer learning, k-means, and the convolutional neural network.
Segmenting Abnormalities in Mammograms (Part 2 of 3)
       
It is a pretty long read, but you will leave with extensive knowledge of general image preprocessing methods and reusable code that you can add to your own image preprocessing “toolbox”. Overview of Image Preprocessing Pipeline: Fig. 1 shows an animated sample of 6 images going through the image preprocessing pipeline, with the changes at each step. Image preprocessing is the equivalent of “data cleaning” for the other types of data that we are familiar with (tabular data, text streams, etc.). In this project, our image preprocessing pipeline consists of the steps illustrated in Fig. 1.
One day in an ordinary clustering day
       
Silhouette score: the silhouette score measures how close each point in one cluster is to the points in the neighboring clusters. The silhouette score ranges from -1 to +1. The reference data is generated using Monte Carlo simulations of the sampling process. Weighted gap values (Gap*): here we can notice that 6, 7, 15, and 17 clusters have similar scores. And now the question emerges one more time: how are we going to define the number of clusters?
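A quick scikit-learn sketch of scanning candidate cluster counts with the silhouette score (synthetic blobs stand in for the article's data):

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=500, centers=4, random_state=42)
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    print(k, round(silhouette_score(X, labels), 3))  # higher is better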
Decrypting complex SQL
       
What can make SQL tricky: when we look at a new SQL query, the major obstacles to our full understanding are intensive use of sub-queries and nested logic; unknown clauses, functions, syntaxes, and database-specific commands; and poorly structured or poorly documented code. So, supposing that we need to understand a SQL query quickly, can we do anything to optimize our SQL analysis approach? Breaking SQL complexity: in this section I will present the 4 main things that can make our SQL analysis more efficient. Split and analyse inner to outer SQL: I will explain each concept and apply it to a SQL example, which actually isn’t a very complex nor a long one, but the very same ideas apply to any SQL query. High-level SQL review, 2.1: identify the main SQL layer. Any query can be seen as composed of layers of other queries and sub-queries. In a nutshell: in this story I presented a very effective way to analyze complex SQL queries.
Using AI? You May Want to Start Conducting Human Rights Impact Assessments (HRIA)
       
You May Want to Start Conducting Human Rights Impact Assessments (HRIA). Photo by Tolu Olubode on Unsplash. In a recent post I offered a few predictions on how the Biden Administration may change the AI playing field. Given recent developments, though, you may need to add another tool to your AI governance toolbox: a Human Rights Impact Assessment (HRIA). · Indivisible and interdependent: all human rights have equal status; “one set of rights cannot be enjoyed fully without the other”; and violation of one right may negatively impact other rights. Human rights were codified in the Universal Declaration of Human Rights (UDHR), adopted unanimously by the UN General Assembly in 1948, and in subsequent documents that together make up the International Bill of Rights. Access Now argues that “the burden of proof [should] be on the entity wanting to develop or deploy the AI system to demonstrate that it does not violate human rights via a mandatory human rights impact assessment (HRIA).”
5 YouTube Playlists That Teach You All About Data Science
       
№1: Data Science 101 by Data Professor. The first playlist on this list covers data science basics: the Data Science 101 playlist by the Data Professor. №2: Intro to Data Science by Steve Brunton. One amazing playlist to learn all the fundamentals of data science and machine learning is the Intro to Data Science playlist by Steve Brunton and his team. №3: CSE 519 — Data Science Fundamentals by Steven Skiena. This playlist contains the data science course taught by Steven Sol Skiena at Stony Brook University. The Data Science playlist includes 10 independent videos, some longer than an hour, each focusing on some aspect of data science. №5: Data Science Full Course For Beginners by Codebasics. Last but not least on the list is the Data Science Course playlist by Codebasics.
Supercharge R Functions with C++
       
R has always been a quite spectacular language for statisticians, as it has been optimized with statistics in mind. In this article, I will outline how to leverage C++ in R, allowing you to supercharge your code by orders of magnitude using Rcpp. As the name implies, Rcpp acts as an interface between R and C++. Not only does it allow R users to take functions they have already built in pure R and rebuild them in C++ for even faster results, but it also enables your C++ functions to be easily exported and used in your R scripts. In this article, I will aim to show you some examples of how to use C++ in your R code and make it even faster than before.
Graph Transformer: A Generalization of Transformers to Graphs
       
Key Design Aspects for Graph Transformer: We find that attention using graph sparsity and positional encodings are the two key design aspects for the generalization of transformers to arbitrary graphs. Now, we discuss these from the contexts of both NLP and graphs to make the proposed extensions clear for Graph Transformer. We extend this critical design block of positional information encoding for Graph Transformer. We therefore leverage the success of the recent works on positional information in GNNs and use Laplacian Positional Encodings [8] in Graph Transformer. Hence, sparse graph structure during attention and positional encodings at the inputs are the two important things we consider while generalizing transformers to arbitrary graphs.
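Laplacian positional encodings can be sketched in a few lines of NumPy: build the normalized graph Laplacian and take its smallest non-trivial eigenvectors as per-node position features (a toy 4-node cycle, not the paper's code):

import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)   # adjacency of a 4-cycle
deg = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt  # normalized Laplacian

eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
k = 2
pos_enc = eigvecs[:, 1:k + 1]  # k smallest non-trivial eigenvectors
print(pos_enc)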
Write a custom training routine for your Keras model
       
The dataset contains 70000 images from ten object classes, with a pre-defined split between 60000 training images and 10000 validation images. Some example images of the FashionMNIST data set. Classical Keras: building a neural network using Keras is super straightforward. Before the model can be trained, Keras requires us to specify some details about the training process, like the optimizer and a loss function. Training a neural network involves minimizing the loss function over a large dataset of training examples. TensorFlow allows us to use the same model built using Keras API functions in the custom training loop.
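The heart of such a custom routine is a loop over batches with tf.GradientTape. A minimal sketch on FashionMNIST (a deliberately small model and a 1024-image slice, not the article's exact setup):

import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.fashion_mnist.load_data()
x_train = x_train[:1024][..., None] / 255.0  # small slice keeps the demo quick
y_train = y_train[:1024]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10)])  # logits
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).batch(64)

for epoch in range(2):
    for x_batch, y_batch in dataset:
        with tf.GradientTape() as tape:       # record the forward pass
            logits = model(x_batch, training=True)
            loss = loss_fn(y_batch, logits)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))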
Deep learning with Python for crack detection
       
Deep learning with Python for crack detection. Problem statement: While new technologies have changed almost every aspect of our lives, the construction field seems to be struggling to catch up. Codes, data, and networks relevant to the implementation of the deep learning models can be found on my GitHub repository [2]. Dataset preparation: the most important part of training a deep learning model is the data; the accuracy of a model heavily relies on the quality and amount of data. Crack detection with 3D scene reconstruction (image by author). So, follow me to stay updated! [2] Crack detection for masonry surfaces: GitHub repository https://github.com/dimitrisdais/crack_detection_CNN_masonry [3] https://github.com/qubvel/segmentation_models
An Attack on Deep Learning
       
An Attack on Deep Learning. Credit: Pixabay. Deep learning has become ubiquitous in data science and an inextricable element of machine learning. (I’ll circle back later to one more reason why big tech is incentivized to propagate widespread deep learning enthusiasm.) This brings us to the real value proposition of cloud-based machine learning: both Google and Amazon have their respective out-of-the-box, cloud-based, deep-learning-driven machine learning products. A common mitigation to the novel-task problem for deep learning is the usage of transfer learning. This form of machine learning (and in these two examples, deep learning) does not require a training dataset.
Comparing Keras and PyTorch syntaxes
       
Comparing Keras and PyTorch syntaxes. Photo by cottonbro from Pexels. Keras and PyTorch are popular frameworks for building programs with deep learning. There are similar abstraction layers developed on top of PyTorch, such as PyTorch Ignite or PyTorch Lightning. After comparing syntaxes in this article, I will demonstrate a practical example on sentiment classification comparing both frameworks in a subsequent article I will publish later this month. Actually, we still need to “compile” the model, like in the Keras example. For the sake of completeness, I share some resources I found covering a comparison between Keras and PyTorch.
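To make the syntax gap concrete, here is the same two-layer classifier sketched in both frameworks (an illustrative model, not the article's example):

# Keras: model definition and training configuration are declarative.
import tensorflow as tf

keras_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(20,)),
    tf.keras.layers.Dense(1, activation='sigmoid')])
keras_model.compile(optimizer='adam', loss='binary_crossentropy')

# PyTorch: the equivalent model, with loss and optimizer set up explicitly.
import torch.nn as nn
import torch.optim as optim

torch_model = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = optim.Adam(torch_model.parameters())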
Lithology Prediction Using Deep Learning: Force 2020 Dataset: Part.1 (data visualization)
       
Lithology Prediction Using Deep Learning: Force 2020 Dataset: Part 1 (data visualization). The objective of this competition was to predict lithology labels from well logs, provided NPD lithostratigraphy and well X, Y positions. In this work, we attempt a standard approach, as in other machine learning problems, to improve prediction scores using a deep learning methodology. For comparability, we will consider the same wells as the train and test wells of the competition. Well data distribution: in the figure below, geographical well locations are plotted as circles using their X and Y coordinates. We can see that the test wells are chosen fairly consistently with the distribution of the whole set of data points.
Step-by-step implementation of GANs on custom image data in PyTorch: Part 2
       
Step-by-step implementation of GANs on custom image data in PyTorch: Part 2. Learn about the different layers that go into a GAN’s architecture, debug some common runtime errors, and develop in-depth intuition behind writing code in PyTorch. In case you would like to follow along, here is the Github notebook containing the source code for training GANs using the PyTorch framework. Preparing the image dataset: one of the main reasons I started writing this article was because I wanted to try coding GANs on a custom image dataset. Preparing the custom dataset class: I know what you’re thinking — why do I need to create a special class for my dataset? In case you need further help creating the dataset class, do check out the PyTorch documentation here.
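A minimal custom dataset class looks like this (the folder path and transform are placeholders; the article's own class differs in details):

import os
from PIL import Image
from torch.utils.data import Dataset

class CustomImageDataset(Dataset):
    # Loads every image file found in a folder.
    def __init__(self, root_dir, transform=None):
        self.paths = [os.path.join(root_dir, f) for f in os.listdir(root_dir)]
        self.transform = transform

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        image = Image.open(self.paths[idx]).convert('RGB')
        if self.transform:
            image = self.transform(image)
        return image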
Tesseract OCR for Text Localisation and Detection
       
Tesseract OCR for Text Localisation and Detection. Optical character recognition (“OCR”) systems have been widely used to provide automated text entry into computerised systems. OCR systems transform a two-dimensional image of text, which could contain machine-printed or handwritten text, from its image representation into machine-readable text. What is text localisation and detection? Text detection is the process of localising where the text in an image is. The idea of text detection can be thought of as a specialised form of object detection.
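With pytesseract, localisation amounts to asking Tesseract for per-word boxes and drawing them (the input file name is a placeholder):

import cv2
import pytesseract
from pytesseract import Output

image = cv2.imread('document.png')
data = pytesseract.image_to_data(image, output_type=Output.DICT)

# Draw a box around every detected word with reasonable confidence.
for i in range(len(data['text'])):
    if float(data['conf'][i]) > 60:
        x, y, w, h = (data['left'][i], data['top'][i],
                      data['width'][i], data['height'][i])
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('localised.png', image)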
Simple Genetic Algorithm From Scratch in Python
       
In this tutorial, you will discover the genetic algorithm optimization algorithm. Tutorial Overview: This tutorial is divided into four parts: the genetic algorithm; the genetic algorithm from scratch; the genetic algorithm for OneMax; and the genetic algorithm for continuous function optimization. Genetic Algorithm: The genetic algorithm is a stochastic global search optimization algorithm. Now that we are familiar with the simple genetic algorithm procedure, let’s look at how we might implement it from scratch. Genetic Algorithm From Scratch: In this section, we will develop an implementation of the genetic algorithm. Genetic Algorithm for OneMax: In this section, we will apply the genetic algorithm to a binary string-based optimization problem.
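A compact sketch of the full loop for OneMax (tournament selection, one-point crossover, bit-flip mutation; the hyperparameter values are illustrative, not the tutorial's):

import random

def onemax(bits):
    return sum(bits)

def genetic_algorithm(n_bits=20, n_pop=50, n_gen=50, p_mut=0.05):
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(n_pop)]
    best = max(pop, key=onemax)
    for _ in range(n_gen):
        # Tournament selection of parents.
        selected = [max(random.sample(pop, 3), key=onemax) for _ in range(n_pop)]
        children = []
        for i in range(0, n_pop, 2):
            p1, p2 = selected[i], selected[i + 1]
            point = random.randint(1, n_bits - 1)  # one-point crossover
            for child in (p1[:point] + p2[point:], p2[:point] + p1[point:]):
                # Bit-flip mutation.
                children.append([b ^ 1 if random.random() < p_mut else b
                                 for b in child])
        pop = children
        best = max(pop + [best], key=onemax)
    return best

print(onemax(genetic_algorithm()))  # approaches n_bits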
Analyzing open-source ML pipeline models in real time using Amazon SageMaker Debugger
       
SageMaker Debugger offers the capability to debug ML models during training by identifying and detecting problems with the models in near-real time. This feature can be used when training models within Kubeflow Pipelines through the SageMaker Training component. For more information, see Amazon SageMaker Debugger – Debug Your Machine Learning Models. Using SageMaker Debugger for Kubeflow Pipelines with XGBoost: This post demonstrates how adding additional parameters to configure the debugger component can allow us to easily find issues within a model. Using SageMaker Debugger in your Kubeflow Pipelines lets you go beyond just looking at scalars like losses and accuracies during training.
Load Testing a ML Model API
       
And, if you hadn’t guessed by the title of this page, that’s the subject of this post: load testing ML model APIs. Load testing is a form of non-functional testing: it’s aimed at testing system behaviours, not the specifics of the functionality of the system. Load testing in practice: Hopefully you’re now sold on the utility of load testing. For example, when defining user behaviours for testing a ML model API, it might be tempting to load historical data as sample payloads to test your API with. With that, you’ve got a basic example of using Locust for load testing a model API.
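A basic Locust user definition for a model API might look like this (the endpoint path and payload are hypothetical; adapt them to your API's schema):

from locust import HttpUser, task, between

class ModelAPIUser(HttpUser):
    wait_time = between(1, 3)  # simulated think time between requests

    @task
    def predict(self):
        # Hypothetical endpoint and payload.
        self.client.post('/predict', json={'features': [5.1, 3.5, 1.4, 0.2]})

Run it with locust -f locustfile.py --host http://localhost:8000 and set the number of simulated users in the web UI.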
Literature search skills will help you to become a more effective Data Scientist
       
Literature search skills will help you to become a more effective Data Scientist. Many newcomers to the data science field often look for ready-made recipes to follow that can be applied to any machine learning task. And one thing that is very often missing from the countless data science courses available on the market is the importance of learning to explore the research literature. Yes — developing literature search skills will help you to become a more effective data scientist. It is not a coincidence that many experts believe that domain knowledge is one of the most important skills in data science. If you already know the field in which you are planning to apply data science techniques — great!
Using Machine Learning to Predict if a F1 Driver Will Score
       
The first thing I did was to merge all these datasets to create one with all the necessary information I needed. With the base dataset ready, I started to verify the fields and how the data was distributed in them. The first check was how drivers’ positions in the race were distributed, which produced the chart below. With that in mind, and judging that I had too little data to predict the exact finishing position, I realized that I could only predict whether the driver would score in the race or not, i.e. whether the driver would finish in the first 10 positions. So one more column was created, our target: 1 if the driver will score and 0 if not.
When should we use the log-linear model?
       
When should we use the log-linear model? Log-linear model: The vastly utilized model that can be reduced to a linear model is the log-linear model, described by the functional form y = exp(b0) * x1^b1 * x2^b2 * ... * xk^bk * exp(e), which becomes linear after taking logarithms: ln(y) = b0 + b1*ln(x1) + ... + bk*ln(xk) + e. The difference between the log-linear and the linear model lies in the fact that in the log-linear model the dependent variable is a product, instead of a sum, of the independent variables. If we are using a log-linear model, we must remember that we are calculating the logarithms of the dependent and independent variables. The question that arises is what kind of distribution we should observe in our variables to consider using a log-linear model. Thus we see that in practice we should use a log-linear model when the dependent and independent variables have lognormal distributions.
The Flawless Pipes of Tidyverse
       
The Flawless Pipes of Tidyverse. Photo by Vikas Gurjar on Unsplash. Python and R are the programming languages that dominate the field of data science. In this article, we will focus on Tidyverse, a collection of R packages for data science. Tidyverse contains several packages for data analysis, manipulation, and visualization. We will go over several examples that demonstrate how pipes can combine data manipulation and analysis steps, for instance filtering observations (i.e. rows) based on distance and the number of rooms.
What Einstein Can Teach Us About Machine Learning
       
Rapid progress has been made in machine learning over the past decade, particularly for problems involving complex high-dimensional data, such as those in computer vision or natural language processing. Whereas a young child may learn to recognise a new animal from just a handful of examples, a modern machine learning system may require hundreds or even thousands of examples to achieve the same feat. Symmetry in machine learning: Machine learning practitioners are well aware of the importance of placing constraints on models to control the bias-variance tradeoff. One particularly effective way to introduce inductive biases into machine learning models to address this issue — which at this point should come as no surprise — is to leverage principles of symmetry! Integrating symmetry into machine learning for planar images and beyond: The integration of translational symmetry into machine learning models is one of the key factors responsible for driving the revolutionary advances seen in computer vision over the past decade (combined with the proliferation of data and compute power).
VirtualDataLab: A Python library for measuring the quality of your synthetic sequential dataset
       
Indeed, synthetic data is more and more popular — I see this every day, working at a synthetic data company [1]. Synthetic data is created with a synthetic data generator. Best practices for picking a synthetic data generator: Typical ways to measure quality range from looking at summary statistics to using the synthetic data in a downstream machine learning task. There is no shortage of tools or how-to guides on creating synthetic data, but very little on how exactly to measure the utility/privacy aspects of the synthetic data. Generating synthetic data and comparing it to the original data, all in less than 5 lines of code.
The Fastest Way to Deploy Your ML App on AWS with Zero Best Practices
       
The Fastest Way to Deploy Your ML App on AWS with Zero Best Practices. You’ve been working on your ML app and a live demo is coming up fast. Now you only have 1 hour before the presentation, and your ML app needs to be available on the internet. This tutorial is the “zero best practices” way to create a public endpoint for your model on AWS. You’ll need: the AWS console, a terminal, and your ML app. If you don’t have an ML app and just want to follow along, here’s the one I wrote this morning. Again, if you don’t have an ML app, clone my slapdash one and cd into the directory.
Generating your own Images with NVIDIA StyleGAN2-ADA for PyTorch on Ampere
       
We will install StyleGAN2 outside of WSL2 or Docker. First, make sure you have the latest NVIDIA driver for your graphics card. Second, install the latest version of CUDA 11. Here I am converting all of the JPEG images that I obtained, to train a GAN to generate images of fish. To generate images from this network, the following command is used. Conclusion: NVIDIA StyleGAN2 ADA is a great way to generate your own images if you have the hardware for training.
Get Your Hands on Interesting Machine Learning Projects
       
Get Your Hands on Interesting Machine Learning Projects. Photo by Jules Amé from Pexels. We’re now living in an age that has become increasingly dominated by emerging technologies like machine learning, artificial intelligence, the Internet of Things, and many others like them. Machine learning, in particular, has impacted all of our lives and made them a tad bit smarter in more ways than we can imagine. Thanks to its skyrocketing popularity, most of the digital services that we use today rely on machine learning in one way or another, making it one of the best and most useful technologies to work with. If you’re also impressed with the capabilities of machine learning and have started learning about it, why not take it a step further and include a more practical approach along with the regular theoretical approach? In this article, we will share more than 10 machine learning projects that will give you a more hands-on experience of the technology, essentially speeding up your learning.
Single-node and distributed Deep Learning on Databricks
       
Distributed Deep Learning: We have seen the value of single-node Databricks clusters for machine learning when we run Databricks notebooks on single nodes with adequate memory and GPU/CPU resources. Spark-Deep-Learning by Databricks supports Horovod on Databricks clusters with the Machine Learning runtime. This library enables the use of Parquet files stored in a data lake during single-node and distributed training of deep learning models. Distributed data on HDFS-like filesystems can be enabled for distributed training of deep learning models. We moved from running deep learning training in parallel on single-node clusters to running distributed training on multi-node clusters.
The world’s first large scale medical AI in Production — eye diagnosis by Deepmind
       
The world’s first large scale medical AI in production — eye diagnosis by Deepmind. Putting an AI system into the world’s oldest eye hospital and the biggest one in Europe and North America — Moorfields Eye Hospital. If you have been following me for a while, you know that I am heavily into medical AI. Not too long ago, Deepmind released an AI system that automates the diagnosis of specific eye diseases at Moorfields hospital in the UK. I was asked to implement an AI system that diagnoses glaucoma (an eye disease) at an actual hospital. Because I think AI research is at a very good place, while production AI just isn’t. Doctors can look at those heatmaps in the AI system as an indication that the AI system really understands the different parts of the OCT scan.
Text Summarization through use of Spacy library
       
Text summarization can be of various types: 1) Based on input type: it can be a single document or multiple documents from which the text needs to be summarized. 2) Based on purpose: what is the purpose of summarization; does the person need answers to queries, domain-specific summarization, or generic summarization? Steps to text summarization: 1) Text cleaning: removing stop words and punctuation marks, and lower-casing the words. 3) Word frequency table: count the frequency of each word, then divide each frequency by the maximum frequency to get the normalized word frequency count.

from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation

stopwords = list(STOP_WORDS)
punctuation = punctuation + '\n'  # also treat newlines as punctuation

Store the text to be summarised in a variable named text.
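A sketch of step 3, the normalized word frequency table (a made-up two-sentence text for illustration):

from collections import Counter
from spacy.lang.en.stop_words import STOP_WORDS

text = 'Text summarization condenses long text. Good summarization keeps the key ideas.'
words = [w.strip('.,!?').lower() for w in text.split()]
words = [w for w in words if w and w not in STOP_WORDS]

freq = Counter(words)
max_freq = max(freq.values())
normalized = {word: count / max_freq for word, count in freq.items()}
print(normalized)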
Improving The Inference Speed of TabNet
       
Improving The Inference Speed of TabNet. TabNet [1] is a deep neural network (DNN) based model for tabular data sets. The authors of TabNet note that DNNs have been successful for image data and sequential data (e.g., text), but that on tabular data they perform worse than gradient boosting models such as LGBM or XGBoost. They develop a novel DNN-based model for tabular data sets and show that TabNet performs significantly better than gradient boosting models. Despite TabNet’s performance, it has one weakness — slow inference speed. I will first explain why inference speed is slow (if you are not interested, you can skip ahead two paragraphs).
What I’ve learned in my career as a Data Scientist
       
What I’ve learned in my career as a Data Scientist. What career advice would I give to myself if I could go back in time? While responding, I always do a bit of a retrospective on my career in data science. This article is a write-up of lessons learned through my professional career. Eight years ago… I started my professional career in 2013, when I got a Data Scientist internship in a research institute — a dream come true. The important lesson that I learned was that building relationships is more important than the project you are working on.
Your MCAR Data Technique Guide
       
Mode Imputation (Univariate Data): This technique involves filling in the missing values in a single column with the mode of the non-null data in that column. This can also disrupt the relationship of the data column with the other data columns in your data set. This is actually a flexible technique, since you can apply it regardless of whether the data is MCAR or another type of missing data. This can become an increasing issue the larger your data set is and/or the higher the percentage of missing values in your data column. You are adding a new data column for every column of missing data.
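A small pandas sketch of mode imputation, plus the extra indicator column the guide mentions (toy data):

import numpy as np
import pandas as pd

df = pd.DataFrame({'color': ['red', 'blue', np.nan, 'red', np.nan]})
mode_value = df['color'].mode()[0]            # most frequent non-null value
df['color_was_missing'] = df['color'].isna()  # one new column per imputed column
df['color'] = df['color'].fillna(mode_value)
print(df)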
Does your Machine Learning pipeline have a pulse?
       
Does your Machine Learning pipeline have a pulse? The process of building and training machine learning models is always in the spotlight. In 2015, Google published a seminal paper called “Hidden Technical Debt in Machine Learning Systems”. To get started, see the story below. About the author: My name is Dimitris Poulopoulos, and I’m a machine learning engineer working for Arrikto. If you are interested in reading more posts about machine learning, deep learning, data science, and DataOps, follow me on Medium, LinkedIn, or @james2pl on Twitter.
5 Tools to Detect and Eliminate Bias in Your Machine Learning Models
       
Although machine learning has many advantages, if your machine learning model contains any type of bias, you’ll not be able to harness its full potential. That’s why detecting bias in machine learning models has been the focus of many researchers over the past couple of years. This research has developed some tools that you can use to check if your machine learning model is biased. This article will take you through 5 tools that can help you detect and mitigate bias in your next machine learning model. FairML is a Python open-source toolbox that is used to audit machine learning predictive models to detect bias.
The Evolution of AI — Plus 3 Traits Needed to Survive the Next Mass Extinction
       
Innovations in data science and AI abound and are already changing the very nature of business and life. Their definition of the current era we live in, what we are calling the MLOps era, is constrained. The MLOps Era: As we begin 2021, the data science, ML, and AI industry is in the early days of the MLOps era. It seeks to solve the last-mile problem of getting more data science products into production — to operationalize ML and AI. Next Steps: Organizations should embrace the current MLOps era while simultaneously laying the foundation for the next.
SHAP Values for Model Interpretation
       
SHAP Values for Model Interpretation. In an increasing number of domains, machine learning models have started being held to a higher standard. Here we’ll look at SHAP values, a powerful way to explain predictions that come from a machine learning model. There are many other ways to visualize SHAP values from a model; these examples are just to get you started. We stepped through an example calculation of SHAP values by looking at a model that determines the price of a house. We also looked at the shap library in Python, which lets us quickly compute and visualize SHAP values.
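A minimal sketch with the shap library (California housing and XGBoost stand in for the article's house-price example):

import shap
import xgboost
from sklearn.datasets import fetch_california_housing

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
model = xgboost.XGBRegressor(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:200])  # a sample keeps it quick
shap.summary_plot(shap_values, X.iloc[:200])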
Recurrent Neural Nets for Audio Classification
       
Recurrent Neural Nets for Audio Classification. Recurrent Neural Nets: RNNs, or recurrent neural nets, are a type of deep learning algorithm that can remember sequences. Unless the audio is a random stream of garbage (not the band), audio information tends to follow a pattern. A sample RNN model architecture: in TensorFlow, you can create such an RNN model like this:

from tensorflow import keras
from tensorflow.keras.layers import LSTM, Dropout, Dense

input_shape = (128, 1000)
model = keras.Sequential()
model.add(LSTM(128, input_shape=input_shape))
model.add(Dropout(0.2))
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(48, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(24, activation='softmax'))
model.summary()

The activation functions add nonlinearity to the model. The final softmax suits this classification problem, where each audio sample belongs to one species. Look for discrepancies in performance between the training and test data, and add Dropout layers to prevent overfitting to the training data.
How do you remain on top of Machine Learning research and best practices?
       
How do you remain on top of Machine Learning research and best practices? Towards Data Science and I would love to hear from you and the community. We wanted to ask you: how do you remain on top of the machine learning literature and best practices? Are you overwhelmed by the number of papers coming out every day and with every conference? If so, how do you choose which papers to concentrate on?
Generative Networks: From AE to VAE to GAN to CycleGAN
       
The prior distribution captures our assumptions about how the data might be distributed. This divergence comes from probability theory and is used in deriving parameters for a probability distribution. This fits the VAE setting, where these parameters have to be learned. The noise input is a prior on the true data distribution, akin to the Gaussian prior in the VAE setting. This transformative function is a complex function (meaning “not simple”), which neural networks have been shown to learn exceedingly well; the network’s weights are the function’s parameters.
What is a Tensor Processing Unit (TPU) and how does it work?
       
What is a Tensor Processing Unit (TPU) and how does it work? During the last few years, we have seen new chips being developed by giants in the industry such as Nvidia and ARM to optimize the tensor (matrix) operations of machine learning, which brings us to Tensor Processing Units (TPUs). TPUs vs GPUs: Although both TPUs and GPUs perform tensor operations, TPUs are more oriented toward performing the large tensor operations that are frequently present in neural network training [2], as opposed to 3D graphics rendering. Another interesting concept used to speed up tensor operations is the so-called “systolic array”. Another good point to note here is that when you are using Colab/Kaggle’s TPU, you aren’t only using one TPU core; you are actually using quite a few.
Differential Evolution Global Optimization With Python
       
How to use the differential evolution optimization algorithm API in Python. Tutorial Overview: This tutorial is divided into three parts: differential evolution; the differential evolution API; and a differential evolution worked example. Differential Evolution: Differential evolution, or DE for short, is a stochastic global search optimization algorithm. Differential Evolution API: The differential evolution global optimization algorithm is available in Python via the differential_evolution() SciPy function. Differential Evolution Worked Example: In this section, we will look at an example of using the differential evolution algorithm on a challenging objective function.
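A minimal sketch of the SciPy API on a simple objective (the 2D sphere function here, not the tutorial's harder one):

from scipy.optimize import differential_evolution

def objective(x):
    # Sphere function: global minimum of 0 at (0, 0).
    return x[0] ** 2 + x[1] ** 2

bounds = [(-5.0, 5.0), (-5.0, 5.0)]
result = differential_evolution(objective, bounds, seed=1)
print(result.x, result.fun)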
Kernel Machine From Scratch
       
Function Space in Machine Learning: To fully appreciate the theory behind the kernel method, it’s essential to first clarify the concept of a function space and the role it plays in machine learning. Such a set of functions, known as the hypothesis set in the machine learning field, is exactly a function space. Consider the function space of f: ℝ → ℝ with a linear or polynomial functional form. Consider a kernel function K: X × X → ℝ satisfying the inner product properties. To build up a function space from K, we can span a function space with the functions K(·, x) as a basis, as shown in the figure below.
SMOTE: Synthetic Data Augmentation for Tabular Data
       
An alternative to ADASYN is K-Means-SMOTE, which generates synthetic samples based on the density of each cluster found in the minority class. SMOTE in practice: In this section, we will see the SMOTE [2] implementation and its variants (Borderline-SMOTE [3] and ADASYN [4]) using the Python library imbalanced-learn [1]. The implementation of SMOTE, Borderline-SMOTE, and ADASYN is relatively simple thanks to the imbalanced-learn library. These hyperparameters will depend on the size of the dataset, the class imbalance ratio, the number of samples in the minority class, etc. Figures 6, 7, and 8 show the visualizations of the implementations of the SMOTE, Borderline-SMOTE, and ADASYN algorithms respectively.
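A quick sketch with imbalanced-learn (a synthetic 90/10 dataset stands in for real data):

from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=42)
print('before:', Counter(y))
X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print('after:', Counter(y_res))  # minority class oversampled to parity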
NFNets Explained — DeepMind’s New State-Of-The-Art Image Classifier
       
NFNets Explained — DeepMind’s New State-Of-The-Art Image Classifier. Introduction: DeepMind has recently released a new family of image classifiers that achieve a new state-of-the-art accuracy on the ImageNet dataset. Batch Normalization — The Good: First, let’s understand the benefits that batch normalization brings. Batch normalization has a regularizing effect: with its introduction, researchers found that dropout was no longer necessary, as applying batch normalization helps to regularize the network. Batch normalization allows efficient large-batch training: by smoothing the loss landscape, batch normalization allows us to use a larger batch size and learning rate without overfitting. Batch Normalization — The Bad: Even though batch normalization has enabled image classifiers to make substantial gains in recent years, it does have many negative consequences.
How to Visualize Decision Tree from a Random Forest Model?
       
How to Visualize a Decision Tree from a Random Forest Model? The random forest model can be considered a black-box model, as the cause of its predictions is very difficult to interpret. This library is used to export the decision tree in DOT format and generate a GraphViz representation of the decision tree. To visualize a decision tree of a random forest, follow these steps: load the dataset; train a Random Forest Classifier model with the n_estimators parameter as the number of base learners (decision trees). This technique works best for interpreting the model to business folks, for a random forest model with a small number of decision trees.
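A compact sketch of those steps with scikit-learn's export_graphviz (Iris as a stand-in dataset):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_graphviz

X, y = load_iris(return_X_y=True)
rf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Export the first base learner in DOT format; render it with GraphViz.
export_graphviz(rf.estimators_[0], out_file='tree.dot',
                filled=True, rounded=True)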
Dealing with Overconfidence in Neural Networks: Bayesian Approach
       
In this post, I explore a Bayesian method for dealing with overconfident predictions for inputs far away from training data in neural networks. Decreasing σ² causes the predictions to be more and more similar to the softmax predictions. When does the LLLA model produce high or low confidence predictions? Even with a threshold value of 0.5, the LLLA model is more than 95% accurate on the validation set. Also, using the 0.5 threshold with the LLLA model excludes all Simpsons characters discussed in the previous section, whereas the softmax model will be mostly unchanged.
Intro to Regularization With Ridge And Lasso Regression with Sklearn
       
Regularization With Ridge: Both problems, related to bias and overfitting, can be elegantly addressed using Ridge and Lasso regression. This is called the Ridge regression penalty. This way, ridge regression gets to make important features more pronounced and shrink unimportant ones close to 0, which leads to a more simplified model. It is true that they start with a slightly worse fit, but Ridge and Lasso provide better and more consistent predictions in the long run. Using a subset of features, we will predict house prices using Ridge in this section and Lasso in the next one.
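A minimal scikit-learn sketch of both estimators (the dataset and alpha values are illustrative stand-ins for the article's setup):

from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import train_test_split

X, y = fetch_california_housing(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# alpha controls the strength of the penalty in both models.
for model in (Ridge(alpha=1.0), Lasso(alpha=0.1)):
    model.fit(X_tr, y_tr)
    print(type(model).__name__, round(model.score(X_te, y_te), 3))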
Understanding Audio: What sound is and how we can leverage it.
       
Understanding Audio: What sound is and how we can leverage it. Sound intensity describes the sound power passing through an area, measured in Watts per square meter. We can use AI to intelligently manipulate digital audio to improve existing audio for our needs. Audio super-resolution, another facet of audio enhancement, allows us to dramatically enhance low-quality audio by increasing its fidelity. Understanding the properties of sound and waveforms is crucial for building better AI sound processing algorithms.
Four Deep Learning Papers to Read in March 2021
       
Four Deep Learning Papers to Read in March 2021. Welcome to the end-of-February edition of the ‘Machine-Learning-Collage’ series, where I provide an overview of the different deep learning research streams. So without further ado: here are my four favourite papers that I read in February 2021, and why I believe them to be important for the future of deep learning. (2017) | Paper | Code. One-paragraph summary: Backpropagation is the driving force behind the current deep learning revolution. The framework of synthetic gradients (SG) tries to ‘unlock’ the forward and backward pass by using a surrogate model of the error gradient. The model is learned simultaneously and acts as a drop-in replacement for the backprop gradient.
GPT-3 Navigates the New York Subway
       
Unfortunately, given prompts about TTC subway trips, GPT-3 couldn’t even get single-line trips right. This week I finally found a candidate that had a decent chance of looming large in GPT-3’s training: the New York subway. Testing out GPT-3 on the New York subway system: The New York subway comes in for some criticism these days because of a lack of maintenance investment. For the purpose of reusing my GPT-3 harness, the New York subway was perfect — a big network with an interesting topology that had a decent chance of good representation in the training corpus of GPT-3. The experiments with the London Underground and the New York subway show me that GPT-3 does not produce spectacular results when it comes to navigating spatial problems.
Demystifying Neural Networks
       
Neural Network Diagram: Of course, the above diagram is extremely simplified, and real-world applications can get pretty complex. It only has one input layer and one output layer, and no hidden layers. The output layer needs to have the same dimensions as y, (100, 1) in this example. The gradient descent algorithm updates the parameters in the direction of the negative gradient (down along the loss function). The gradient is the derivative of the loss function with respect to each of the weights (partial derivatives).
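That update rule fits in a few lines of NumPy; here is a sketch for linear regression with a squared-error loss (random stand-in data):

import numpy as np

X = np.random.rand(100, 3)
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)
lr = 0.1
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # dLoss/dw (partial derivatives)
    w -= lr * grad                         # step along the negative gradient
print(w)  # approaches true_w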
Written Communication: The Other Data Science Skill You Need
       
Written Communication: The Other Data Science Skill You Need. None of the data science technical skills are as valuable as communication. It’s great if you were able to use the latest NLP transformer on your data set, or you got a fancy visualization to display your data in a new light. If you can’t communicate the value, these techniques won’t do much aside from providing practice and potential future experience. In this article, I will focus on written communication. I will illustrate the need for written communication through 12 examples from my own data science career.
Applying a clustering algorithm to feature contribution
       
It’s important to remember that these clusters won’t be directly tied to the feature values, but rather to the contributions of those feature values. A useful exercise is to look at the per-group average difference between target values and prediction values. After we do this, we should have a table with, roughly, one row per prediction and one column per feature contribution. We’ll be feeding this dataset to a KMeans algorithm in order to identify trends or clusters within our prediction contributions. Null values: while a feature value may be null, a feature contribution value will not be. However, let’s remember that the cluster centers here are based on the feature contributions, not on the feature values.
3 Things I Did to Become a Data Scientist
       
Table of Contents: Introduction; Master Popular Machine Learning Algorithms; Perform an End-to-End Case Study; Use Data Analytics to Master Data Processing; Summary; References. Introduction: This article is for people who are currently Data Analysts and want to make the career change to data science. While there is material that commonly recommends graduate school, online courses, and tutorials, I wanted to focus on the more specific and unique things you can do to transition to data science, from someone who was a Data Analyst first and is now a professional Data Scientist. Keep on reading if you would like to know three things that you can do to become a Data Scientist. As you can see, being a Data Analyst now will give you certain advantages when pursuing data science. Please feel free to comment down below if you have leveraged any of your data analytics skills when becoming a Data Scientist, and which ones.
Can we explain AI? An Introduction to Explainable Artificial Intelligence.
       
An Introduction to Explainable Artificial Intelligence. Explainable AI, interpretable AI, or transparent AI refer to artificial intelligence (AI) techniques that humans can trust and easily understand. Consider the point where we refuse to use machine learning technology because we cannot explain how the artificial intelligence makes its decision. The research group at the AI Institute at the University of South Carolina is interested in developing artificial intelligence that explains its results. Key points to remember: Explainable AI, interpretable AI, or transparent AI refer to artificial intelligence (AI) techniques that humans can trust and easily understand.
Data Science for Good: A New Type of Datathon
       
Data Science for Good: A New Type of Datathon. The vast majority of information on data science is most likely associated with the private sector, like tech companies, and with programming skills applied to corporate goals. Similar data science techniques can help gain real-time insights into people’s lives and wellbeing, and help target aid interventions to vulnerable groups. Overall, there is a massive opportunity to use data science for good. The World Data League (WDL) is a recently formed team that works in the data field and wants to see the real impact of data science on social problems. The World Data League brings a fresh perspective to data professionals by launching an exciting competition to drive global change.
How do ReLU Neural Networks approximate any continuous function?
       
Neural networks are used ubiquitously in machine learning because they can approximate a wide variety of functions — all continuous piecewise linear functions (CPWLs), given enough hidden units. ReLU is a piecewise linear function that is 0 for all negative values of x and equal to x otherwise. Modifying ReLU: In a neural network, several activations (ReLUs) are aggregated to approximate a function. Consider a continuous piecewise linear function as defined in the image. ReLU functions can exactly represent the piecewise linear function. This can be generalized to any continuous piecewise linear function, and a neural network can be used to represent it.
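The idea can be demonstrated numerically: fit a weighted sum of shifted ReLUs (one hinge per knot) to a smooth target by least squares (an illustrative sketch, not the article's code):

import numpy as np

x = np.linspace(-3, 3, 400)
target = np.sin(x)  # the continuous function to approximate

knots = np.linspace(-3, 3, 10)
# Design matrix: bias, identity, and one ReLU feature per knot.
features = np.column_stack(
    [np.ones_like(x), x] + [np.maximum(0, x - k) for k in knots])
coef, *_ = np.linalg.lstsq(features, target, rcond=None)
approx = features @ coef
print('max error:', np.abs(approx - target).max())  # small for enough knots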
Pandas May Not Be the King of the Jungle After All
       
Pandas May Not Be the King of the Jungle After All. Pandas is dominating data analysis and manipulation tasks with small-to-medium sized data in tabular form. It is arguably the most popular library in the data science ecosystem. I’m a big fan of Pandas and have been using it since I started my data science journey. I’m still debating whether I should change my default choice of data analysis library from Pandas to data.table.

# Pandas
import numpy as np
import pandas as pd
melb = pd.read_csv("/content/melb_data.csv",
                   usecols=['Price', 'Landsize', 'Distance', 'Type', 'Regionname'])

# data.table
library(data.table)
melb <- fread("~/Downloads/melb_data.csv",
              select=c('Price', 'Landsize', 'Distance', 'Type', 'Regionname'))
How to deploy Machine Learning models as a Microservice using FastAPI
       
Microservice implementation using FastAPI | Ashutosh Tripathi | Data Science Duniya. As of today, FastAPI is the most popular web framework for building microservices with Python 3.6+ versions. By deploying machine learning models in a microservice-based architecture, we make code components re-usable, highly maintainable, easy to test, and, of course, quick in response time. In this post, the objective is to explain machine learning model deployment as microservices with the help of FastAPI. Create an API using the FastAPI framework: start from scratch so that you don’t run into errors. Open VS Code or any other editor of your choice, then create a FastAPI "instance" and assign it to app. Here the app variable will be an "instance" of the class FastAPI.
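A minimal sketch of such a microservice (the endpoint, schema, and dummy response are placeholders for a real model):

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class IrisFeatures(BaseModel):
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

@app.post('/predict')
def predict(features: IrisFeatures):
    # Load and call your trained model here (e.g., via joblib).
    return {'prediction': 'setosa'}  # dummy response

Run it with uvicorn main:app --reload and POST JSON to /predict.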
The Future of AI is Decentralized
       
The Future of AI is Decentralized. Photo by Yuyeung Lau on Unsplash. Artificial intelligence is becoming a commodity, and research on AI compounds every year. Yet today's AI research field is inherently non-collaborative, and if this trend continues, eventually the only labs that can conduct AI research will be those with massive budgets. "The future of artificial intelligence (AI) will be determined by how much weight we put into collaboration." — John Suit, CTO at Koda Inc. Photo by NASA on Unsplash. Humans have always collaborated to bring about the technological marvels of the world.
Why Neural Networks Have Activation Functions
       
Why Neural Networks Have Activation Functions. Image by Author. Once in an interview, I was asked "Why do neural networks have activation functions?" At that moment I had a strong conviction that neural networks without activation functions were just linear models, because I had read it in a book somewhere, but I wasn't sure why that was true. A Toy Neural Network: suppose we had a simple neural network with two inputs, a single hidden layer with two nodes (with linear activation functions), and a single output. XOR Linear vs XOR Nonlinear: as a final example, and the one I feel most quickly illustrates the difference between neural networks with all linear activation functions and those with nonlinear activation functions, let's examine how different neural networks perform on XOR. Image by Author. As expected, our neural network with all linear activation functions performs exactly like a linear regression model. In this article, we learned that feedforward neural networks (whether containing recurrent layers, convolutional layers, or even pooling layers) without any nonlinear activation functions are just linear models.
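The collapse is easy to verify numerically; a minimal sketch (my illustration, biases omitted for brevity):

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(2, 2))   # input -> hidden, linear activation
W2 = rng.normal(size=(2, 1))   # hidden -> output

x = rng.normal(size=(5, 2))    # a batch of 5 two-feature inputs

deep = (x @ W1) @ W2           # two stacked linear layers...
shallow = x @ (W1 @ W2)        # ...equal one linear layer with weights W1 @ W2
print(np.allclose(deep, shallow))  # True: no extra expressive power

Adding biases does not change the conclusion: the composition is still a single affine map.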
Google’s RFA: Approximating Softmax Attention Mechanism in Transformers
       
For example, "Attention" and "Mechanism" should be linked together, while neither of them should be heavily linked to "actually" or "is". The input is first transformed with a word embedding matrix. (Note: a word embedding is a vector representation of a word that captures different attributes of that word.) The model then initialises three weight matrices: query W_q , key W_k , and value W_v . As outlined in the paper "Attention Is All You Need" by Vaswani et al., an initial attention matrix can then be calculated as follows:
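The equation itself did not survive extraction; for reference, the scaled dot-product attention defined in "Attention Is All You Need" is

\text{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V

where Q = XW_q, K = XW_k, and V = XW_v are the query, key, and value projections of the input embeddings X, and d_k is the key dimension used for scaling.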
A.I writes Taylor Swift songs ft. GPT-2
       
This article will explore how to use GPT-2 to generate lyrics based on all of Taylor Swift's released songs up to February 2021. Without further ado, these are the steps used to generate new Taylor Swift lyrics with GPT-2. Sample outputs: "I mean it, I'm just gonna shake it off, I'm just gonna shake it off." and "Well, maybe you and I are just a couple of the many nicknames / I'm afraid I'm getting / Just know I'm not letting up until the day I die". 2. Final Remarks: and these were all the steps I took to train an A.I. model to generate Taylor Swift lyrics.
Classifying emotions using audio recordings and Python
       
To accomplish that, we have our Vocal Tract.

DB = librosa.amplitude_to_db(D, ref=np.max)
librosa.display.specshow(DB, sr=sampling_rate, x_axis='time', y_axis='log')
plt.colorbar(format='%+2.0f db')
plt.show()

The Spectrogram: this result is much more informative, as we can see the decibels of the different frequencies over time. As we mentioned earlier, speech can be described as a combination of the Vocal Tract and the Glottal Pulse. To simplify things, we needed to extract the Vocal Tract from the speech without the Glottal Pulse. This spectral envelope is the equivalent of what we considered the Vocal Tract, and its maximum points are called 'Formants'.
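A common way to get at that spectral envelope in code is via MFCCs; a minimal librosa sketch (the file path is a placeholder):

import librosa
import numpy as np

y, sampling_rate = librosa.load("speech_sample.wav", sr=None)

# MFCCs summarize the spectral envelope, roughly the Vocal Tract contribution
mfccs = librosa.feature.mfcc(y=y, sr=sampling_rate, n_mfcc=13)
print(mfccs.shape)  # (13, n_frames): one 13-coefficient envelope per frame

# e.g. average over time to get a fixed-size feature vector for a classifier
features = np.mean(mfccs, axis=1)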
Translate and analyze text using SQL functions with Amazon Athena, Amazon Translate, and Amazon Comprehend
       
You already know how to use Amazon Athena to transform data in Amazon S3 using simple SQL commands and the built-in functions in Athena. Now you can also use Athena to translate and analyze text fields, thanks to Amazon Translate, Amazon Comprehend, and the power of Athena User Defined Functions (UDFs). Athena is an interactive query service that makes it easy to analyze data stored in Amazon S3 using SQL. Optimizing cost: in addition to Athena query costs, the text analytics UDF incurs usage costs from Lambda, Amazon Comprehend, and Amazon Translate. Conclusion: I have shown you how to install the sample text analytics UDF Lambda function for Athena, so that you can use simple SQL queries to translate text using Amazon Translate, generate insights from text using Amazon Comprehend, and redact sensitive information.
Annotator for Object Detection
       
Annotator for Object Detection. In computer vision, object detection is an interesting field with numerous applications. But it was hard for me to find a proper tool that would do everything for me so I could focus on the main problem, object detection. So I present The Object Annotation Maker. Preparing Object Images: it is easier to find images of objects than annotated images, because capturing object images is easy and requires less labour.
Deploying your ML model using Streamlit and Ngrok
       
Deploying your ML model using Streamlit and Ngrok. Photo by Florian Olivo on Unsplash. If you are someone who has been working with data, chances are you have made a machine learning model at least once. In this blog, we will walk through the complete process of creating a model and deploying it to the web. First, we make our model using Linear Regression and save it as a pickle file. We then use this pickle file in a new Jupyter notebook, in which we use Streamlit and Ngrok to deploy our model to the web.
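A minimal sketch of the Streamlit side (the model file and features are assumptions for illustration):

# app.py
import pickle
import streamlit as st

model = pickle.load(open("model.pkl", "rb"))  # the pickled LinearRegression

st.title("Price Predictor")
area = st.number_input("Area (sq ft)", min_value=0.0)
rooms = st.number_input("Number of rooms", min_value=0, step=1)

if st.button("Predict"):
    price = model.predict([[area, rooms]])[0]
    st.write(f"Estimated price: {price:,.0f}")

After running "streamlit run app.py", Ngrok can tunnel the local port (8501 by default) to a public URL.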
What You Should Know Before Becoming a Marketing Data Analyst
       
Unfortunately, this also means the data requests come off the top of their heads as new ideas form. Marketing will still ask you to calculate ROI with imperfect data, and oftentimes it will be hard to say definitively that marketing had an impact. Domain knowledge: if you don't have domain knowledge in marketing, make sure you read up on common marketing concepts and terminology. This knowledge is helpful for guiding marketing, because they may not know the right data questions to ask. Even though there are challenges to working as a marketing data analyst, it's a worthwhile experience, and I hope this has not discouraged you from giving marketing analytics a try.
How to launch a GPU instance on Google Cloud — Fast
       
Before you launch a GPU instance, you need to make sure that your quota is set to 1 or however many GPUs you want to use. First, select ‘Limit Name’ and then select ‘GPUs (all regions)’. Something important to keep in mind is the amount of GPU memory your project will need. This chart shows how much GPU memory each type of GPU has. In order to start the instance, select the notebook you want to use and start.
Automated Feature Engineering Using Neural Networks
       
This is so that we can train a separate feature model around the target feature that will also feed into the final model as well. You should also be careful to remove features that can directly be used to calculate the output feature. To fix this, we will exclude the other gender, age, and contact features from the corresponding feature models. For each feature model, we will create the DenseFeatures input layer (excluding the features defined above) and create a separate model using the add_model function. If so, we append the input features so that the final model can train using the original features as well.
Improving UI Layout Understanding with Hierarchical Positional Encodings
       
The relationships between elements are provided by the RICO dataset in JSON files, which tell us the ancestors and children of each UI element. For our exploration of layout understanding models, we used the aforementioned RICO dataset, consisting of the visual, textual, structural, and interactive design properties of more than 66k unique UI screens. The example below shows the UI with the maximum number of nodes on any given level across the RICO dataset: 421 in total. We can also filter elements from the RICO dataset by restricting the depth of the hierarchy.
Build your own Grammarly in Python
       
Build your own Grammarly in Python. Image by Lorenzo Cafaro from Pixabay. Using good grammar and correctly spelled words helps you write and communicate clearly and get what you want. While typing emails, essays, or articles, one often makes a lot of grammatical and spelling mistakes. GingerIt (image by Pexels from Pixabay) is an open-sourced Python package that is a wrapper around the gingersoftware.com API. Its results are not fully up to the mark: it corrects spelling mistakes and minor grammatical mistakes. The underlying Ginger API also has a paid version, which may give better results in correcting grammatical mistakes.
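A minimal usage sketch of the gingerit wrapper (assuming the package's GingerIt class, as documented at the time):

from gingerit.gingerit import GingerIt

text = "He go to school yesterday and buyed a book"
result = GingerIt().parse(text)

print(result["result"])       # the corrected sentence
print(result["corrections"])  # the list of individual edits it made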
Understanding Feature Importance and How to Implement it in Python
       
The Math Behind Feature ImportanceThere are different ways to calculate feature importance, but this article will focus on only two methods: Gini importance and Permutation feature importance. With this, you can get a better grasp of the feature importance in random forests. Permutation Feature ImportanceThe idea behind permutation feature importance is simple. The feature importance is calculated by noticing the increase or decrease in error when we permute the values of a feature. The permutation feature importance is based on an algorithm that works as follows.
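A hedged sketch of both measures using scikit-learn (dataset and model are illustrative choices):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Gini importance comes for free with tree ensembles
print(model.feature_importances_[:5])

# Permutation importance: drop in score after shuffling one feature at a time
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
top = sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1])[:5]
for name, imp in top:
    print(f"{name}: {imp:.4f}")

Computing permutation importance on held-out data, as here, measures how much the model relies on each feature for generalization rather than for fitting.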
Why 0.9? Towards Better Momentum Strategies in Deep Learning.
       
In deep learning, most practitioners set the value of momentum to 0.9 without attempting to further tune this hyperparameter (i.e., this is the default value for momentum in many popular deep learning packages). Overall, we aim to demonstrate through this post that significant benefit can be gained by developing better strategies for handling the momentum parameter within deep learning. Stochastic gradient descent with momentum (SGDM) is a widely-used tool for deep learning optimization. Therefore, even the best approaches for deep learning optimization are flawed — no single approach for training deep models is always optimal. Through this post, we hope to solve this issue and make momentum decay a well-known option for deep learning optimization.
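For reference, a minimal sketch of the classical (heavy-ball) momentum update being discussed; frameworks implement slightly different but closely related variants:

def sgd_momentum_step(w, v, grad, lr=0.01, beta=0.9):
    # the velocity v accumulates an exponentially decaying average of gradients;
    # beta is the momentum parameter that is usually left at 0.9
    v = beta * v - lr * grad
    w = w + v
    return w, v

In PyTorch, for example, the same default appears as torch.optim.SGD(params, lr=0.01, momentum=0.9).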
Evolution Strategies From Scratch in Python
       
# ackley multimodal function
from numpy import arange
from numpy import exp
from numpy import sqrt
from numpy import cos
from numpy import e
from numpy import pi
from numpy import meshgrid
from matplotlib import pyplot
from mpl_toolkits.mplot3d import Axes3D

# objective function
def objective(x, y):
    return -20.0 * exp(-0.2 * sqrt(0.5 * (x**2 + y**2))) - exp(0.5 * (cos(2 * pi * x) + cos(2 * pi * y))) + e + 20

# define range for input
r_min, r_max = -5.0, 5.0
# sample input range uniformly at 0.1 increments
xaxis = arange(r_min, r_max, 0.1)
yaxis = arange(r_min, r_max, 0.1)
# create a mesh from the axis
x, y = meshgrid(xaxis, yaxis)
# compute targets
results = objective(x, y)
# create a surface plot with the jet color scheme
figure = pyplot.figure()
axis = figure.gca(projection='3d')
axis.plot_surface(x, y, results, cmap='jet')
# show the plot
pyplot.show()

The initial population is sampled uniformly at random within the bounds:

# initial population
population = list()
for _ in range(lam):
    candidate = None
    while candidate is None or not in_bounds(candidate, bounds):
        candidate = bounds[:, 0] + rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
    population.append(candidate)

...
# check if this parent is the best solution ever seen
if scores[i] < best_eval:
    best, best_eval = population[i], scores[i]
    print('%d, Best: f(%s) = %.5f' % (epoch, best, best_eval))

...
# replace population with children
population = children
return [best, best_eval]

Next, we can apply this algorithm to our Ackley objective function.
Google AI Blog: Lyra: A New Very Low-Bitrate Codec for Speech Compression
       
Even though video might seem much more bandwidth hungry than audio, modern video codecs can reach lower bitrates than some high-quality speech codecs used today. Combining low-bitrate video and speech codecs can deliver a high-quality video call experience even in low-bandwidth networks. Yet historically, the lower the bitrate for an audio codec, the less intelligible and more robotic the voice signal becomes. To solve this problem, we have created Lyra, a high-quality, very low-bitrate speech codec that makes voice communication available even on the slowest networks. Lyra OverviewThe basic architecture of the Lyra codec is quite simple.
Setting up Amazon Personalize with AWS Glue
       
Crawling your data with AWS Glue: we use AWS Glue to crawl through the JSON file to determine the schema of your data and create a metadata table in your AWS Glue Data Catalog. Using AWS Glue to convert your files from CSV to JSON: after your crawler finishes running, go to the Tables page on the AWS Glue console. On the AWS Glue dashboard, choose AWS Glue Studio, an easy-to-use graphical interface for creating, running, and monitoring AWS Glue ETL jobs. To learn more about Amazon Personalize scores, see Introducing recommendation scores in Amazon Personalize.
Amazon Rekognition Custom Labels Community Showcase
       
We worked with AWS Machine Learning (ML) Heroes and AWS ML Community Builders to bring to life projects and use cases that detect custom objects with Amazon Rekognition Custom Labels, which lets you detect custom labeled objects and scenes with zero Jupyter notebook experience. AWS ML Heroes and AWS ML Community Builders: Classify LEGO bricks with Amazon Rekognition Custom Labels, by Mike Chambers. Using Amazon SageMaker and Amazon Rekognition Custom Labels to automate detection, by Luca Bianchi. Learn how to detect clean and dirty HVACs using Amazon Rekognition Custom Labels and Amazon SageMaker, from AWS ML Hero Luca Bianchi.
Women in Data Science (WiDS) Datathon on Kaggle
       
Women in Data Science (WiDS) Datathon on Kaggle: the 4th annual WiDS Datathon focuses on social impact, namely on patient health, with an emphasis on the chronic condition of diabetes. The competition is organized by the WiDS Worldwide team at Stanford, the West Big Data Innovation Hub, and the WiDS Datathon Committee, and is launched on Kaggle. You should build a model for Diabetes Mellitus prediction, using the data gathered during the first 24 hours of a patient's intensive care stay. Data exploration: I start data exploration from the DataDictionaryWiDS2021.csv file, with a detailed description of all features. It does not matter if you are a novice or a veteran of data science; you can find several tutorials from the WiDS Datathon Committee to help you get started.
Training a Multi-Label Emotion Classifier with Tez and PyTorch
       
3 — Define a PyTorch Dataset. This doesn't change: you'll still have to define how your data will be loaded and preprocessed. Image by the author. After training for 8 epochs, the model reached 0.97 AUC on the train data and 0.95 on the validation data. Let's now see how the model scores on the test data and evaluate its performance in each class, then test the model on some free text. After a few lines of code, the trained model reached an impressive 0.95 AUC score on the test data.
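A minimal sketch of that Dataset step for multi-label text (the tokenizer and field names are assumptions, not the article's exact code):

import torch
from torch.utils.data import Dataset

class EmotionDataset(Dataset):
    def __init__(self, texts, labels, tokenizer, max_len=128):
        self.texts, self.labels = texts, labels
        self.tokenizer, self.max_len = tokenizer, max_len

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        enc = self.tokenizer(self.texts[idx], truncation=True,
                             padding="max_length", max_length=self.max_len,
                             return_tensors="pt")
        return {
            "ids": enc["input_ids"].squeeze(0),
            "mask": enc["attention_mask"].squeeze(0),
            # multi-label target: one float per emotion class, for a BCE-style loss
            "targets": torch.tensor(self.labels[idx], dtype=torch.float),
        }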
Achieving human-level performance in QWOP using Reinforcement Learning and Imitation Learning
       
Environment: the first step was to connect QWOP to the AI agent so that it could interact with the game. The action space consisted of 11 possible actions: each of the 4 QWOP buttons, 6 two-button combinations, and no keypress. The Reinforcement Learning algorithm combines Advantage Actor-Critic (A2C) with a replay buffer for both on-policy and off-policy learning. (Illustrative: reinforcement learning agent neural network architecture, recoded from QWOP.)
Train “Undying” Flappy Bird Using Reinforcement Learning on Java
       
Train “Undying” Flappy Bird Using Reinforcement Learning on JavaImage by OpenClipart-Vectors from PixabayFlappy Bird is a mobile game that was introduced in 2013 which became super popular because of its simple way to play (flap/no-flap). With the growth of Deep Learning (DL) and Reinforcement Learning (RL), we can now train an AI agent to control the Flappy Bird actions. Today, we will look at the process to create an AI agent using Java. For training, we used Deep Java Library (DJL), a deep learning framework based on Java, to build the training network and RL algorithm. This project used a similar approach with DeepLearningFlappyBird, a Python Flappy Bird RL implementation.
What leaders need to know about AI-driven growth
       
What leaders need to know about AI-driven growth. Making AI impactful and scalable is hard: in virtually every industry, companies invest heavily in AI. This is the basis of a feedback loop we like to call "the virtuous cycle of AI-driven growth" (source: image by author). Understanding the virtuous cycle of AI/ML-driven growth: first, let's establish that successfully deployed AI is an engine for growth. Integrating observability and governance therefore connects the dots and completes your AI/ML growth cycle. In our work, we have talked with hundreds of AI teams and have partnered with dozens of them.
Best Companies to Work for as a Data Scientist
       
Best Companies to Work for as a Data Scientist. Photo by Jordan Whitfield on Unsplash. Are you a Data Scientist actively seeking a job, or are you switching careers to Data Science and want to know what is out there? If so, this article will give you some insights on where to work as a Data Scientist beyond the traditional tech companies. For this reason, here is a list of tech companies that have the potential to provide a unique Data Science career. Conclusion: as the demand for Data Scientists continues to grow, more companies rely on data to improve their business, products, and services. There is a wide range of companies for Data Scientists, from cybersecurity and customer relationship management to open-source AI.
Improving Models Using Missing Values
       
Removing samples (i.e., rows) with missing values forces us to ignore other non-missing values that could help us build better predictive models. It makes sense to have missing values for people without cancer, but why do we have many missing values for people who were finally diagnosed with cancer? Note that some women might leave those questions unchecked, but the meaning of missing values for male patients is different from that for female patients. Sometimes, the correlation between missing values and ID numbers can help us understand both the missing values and the structure behind those codes and numbers. The other danger of removing samples with missing values (before investigating the reasons behind them) is introducing bias into our models or studies.
How to use PyCaret - the library for easy ML
       
PyCaret is a high-level, low-code Python library that makes it easy to compare, train, evaluate, tune, and deploy machine learning models with only a few lines of code. Yes, you could use these libraries for the same tasks, but if you don’t want to write a lot of code, PyCaret could save you a lot of time. This function uses a library for explainable machine learning called SHAP that I covered in the article below. MLflow UIAnother nice feature of PyCaret is that it can log and track your machine learning experiments with a machine learning lifecycle tool called MLfLow. You can’t perform more complex machine learning tasks such as image classification and text generation with PyCaret (at least with version 2.2.0).
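A hedged sketch of the PyCaret 2.x workflow (the bundled demo dataset and target column are illustrative):

from pycaret.classification import setup, compare_models, tune_model, predict_model
from pycaret.datasets import get_data

data = get_data("juice")                    # a bundled demo dataset
setup(data=data, target="Purchase", session_id=42, silent=True)

best = compare_models()     # trains and ranks many models with one call
tuned = tune_model(best)    # automated hyperparameter tuning
predict_model(tuned)        # holdout predictions and metrics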
Deep Learning with R and Keras: Build a Handwritten Digit Classifier in 10 Minutes
       
Installing Tensorflow and Keras with RTo build an image classifier model with Keras, you’ll have to install the library first. But before you can install Keras, you’ll have to install Tensorflow. The procedure is a bit different than when installing other libraries. Yes, you’ll still use the install.packages() function, but there’s an extra step involved. Here’s how to install Tensorflow from the R console:install.packages("tensorflow")library(tensorflow)install_tensorflow()Most likely, you’ll be prompted to install Miniconda, which is something you should do — assuming you don’t have it already.
Why the FDA Regulating Medical AI Products Would Be Good for Everyone
       
This is true of classical medical methods before machine learning, and AI has the potential to improve or reduce medical biases. Widespread use: without regulation providing a framework to create trust in efficacy and safety, use of medical AI products is limited. Concerns over collecting patient data seem to be increasing as well, with cases like Ascension's alleged sale of identifiable patient data to Google. Stability: in the Trump administration's last days, it filed a proposal to permanently exempt many medical AI categories from FDA review. While adaptability is key in this new domain, we should have an established, stable standard for AI medical products, just like we do with non-AI medical products.
Edge AI: Embedded Hardware Considerations
       
Edge AI: Embedded Hardware Considerations (Fig: Edge AI deployment pipeline, image by author). Introduction: AI/ML use-cases are pervasive, and Edge AI aims to bring all the goodness of AI to the device. Edge AI enables visual, location, and analytical solutions at the edge for diverse industries, such as healthcare, automotive, manufacturing, retail, and energy. A report by 360 Research Reports estimates that "the global Edge AI Software market size will reach US$ 1087.7 million by 2024". The accompanying Nvidia Jetson AGX Xavier Developer Kit provides tools and libraries for the development of Edge AI applications.
A Comprehensive List Of Proven Techniques To Address Data Scarcity In Your AI Journey
       
Background photo by Jan Antonin Kolar on Unsplash. Machine Learning, Deep Learning, Data Science. A Comprehensive List Of Proven Techniques To Address Data Scarcity In Your AI Journey. In general, we know that Machine Learning, and especially Deep Learning, requires BIG data to work well. Apart from FAANG (Facebook, Apple, Amazon, Netflix, and Google), most companies do not have access to that kind of BIG data. Moreover, ML models, especially complex models like Neural Networks, overfit small training data by memorizing it without learning the underlying patterns. The main flavours of scarcity are: small data, no data, rare data, costly data, and imbalanced data. 1. For image data, augmentation can be done by modifying lighting conditions, random cropping, horizontal flipping, applying transformations like translation, rotation, or shearing, and zooming in or out (Image Data Augmentation, image by author); a sketch of standard image augmentations follows below. You can also apply data augmentation to text data via back translation, synonym replacement, and random insertion, swap, and deletion (Back Translation — Text Data Augmentation, image by author). In addition to these, there are techniques to learn an optimal data augmentation policy for your dataset, referred to as AutoAugment.
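A minimal torchvision sketch of the image augmentations listed above (my own illustration):

from torchvision import transforms

train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),                      # random cropping / zoom
    transforms.RandomHorizontalFlip(),                      # horizontal flipping
    transforms.ColorJitter(brightness=0.3, contrast=0.3),   # lighting conditions
    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), shear=10),
    transforms.ToTensor(),
])

Each epoch then sees a slightly different version of every image, which effectively multiplies a small dataset.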
Self-Supervised Policy Adaptation during Deployment
       
Self-Supervised Policy Adaptation during DeploymentOur method learns a task in a fixed, simulated environment and quickly adapts to new environments (e.g. Assuming that gradients of the self-supervised objective are sufficiently correlated with those of the RL objective, any adaptation in the self-supervised task may also influence and correct errors in the perception and decision-making of the policy. SAC+IDM is a Soft Actor-Critic (SAC) policy trained with an Inverse Dynamics Model (IDM), and SAC+IDM (PAD) is the same policy but with the addition of policy adaptation during deployment on the robot. Policy adaptation is especially effective when the test environment differs from the training environment in multiple ways, e.g. CURL is a contrastive method, SAC+IDM is a Soft Actor-Critic (SAC) policy trained with an Inverse Dynamics Model (IDM), and SAC+IDM (PAD) is the same policy but with the addition of policy adaptation during deployment.
Process documents containing handwritten tabular content using Amazon Textract and Amazon A2I
       
Set up an Amazon A2I human loop to review and modify the Amazon Textract response. Use the Amazon Textract Parser Library to process the response: we will now import the Amazon Textract Response Parser library to parse and extract what we need from Amazon Textract's response.

client = boto3.client(
    service_name='textract',
    region_name='us-east-1',
    endpoint_url='https://textract.us-east-1.amazonaws.com',
)
with open(documentName, 'rb') as file:
    img_test = file.read()
    bytes_test = bytearray(img_test)
    print('Image loaded', documentName)

# process using image bytes
response = client.analyze_document(Document={'Bytes': bytes_test}, FeatureTypes=['TABLES', 'FORMS'])

You can use the Amazon Textract Response Parser library to easily parse the JSON returned by Amazon Textract. When text analysis is finished, Amazon Textract publishes a completion status to the Amazon Simple Notification Service (Amazon SNS) topic that you specify in NotificationChannel . Sending predictions to Amazon A2I human loops: we create an item list from the Pandas DataFrame where we have the Amazon Textract output saved.
Using container images to run TensorFlow models in AWS Lambda
       
You can package your code and dependencies as a container image using tools such as the Docker CLI. For more information about handlers for Lambda, see AWS Lambda function handler in Python. The base images are preloaded with language runtimes and other components required to run a container image on Lambda. Creating the Lambda function with the TensorFlow code: to create your Lambda function, complete the following steps. On the Lambda console, choose Functions. You can bring your custom models and deploy them on Lambda using up to 10 GB for the container image size.
Top 5 Layers You Can Always Come Across in Any Convolutional Neural Network
       
Dropout Layer: the dropout layer is not a mandatory layer in a convolutional neural network architecture. However, because overfitting occurs frequently, we often come across one or more dropout layers in convolutional neural networks. Thanks to dropout layers, a convolutional neural network can learn to optimize the model weights in a more flexible way, which prevents overfitting. Flatten Layer: as we pointed out earlier, every convolutional neural network consists of a convolutional part and a fully connected part, and the Flatten Layer (figure by author) connects them. Fully Connected Layer or Network: a fully connected layer is the last part of a convolutional neural network.
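Stacking the five layer types in Keras looks like the following minimal sketch (shapes are illustrative):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),  # convolution
    layers.MaxPooling2D((2, 2)),             # pooling
    layers.Dropout(0.25),                    # dropout against overfitting
    layers.Flatten(),                        # flatten feature maps into a vector
    layers.Dense(10, activation="softmax"),  # fully connected output
])
model.summary()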
Here’s your Personal Telegram Bot to Learn New Language
       
Here's your Personal Telegram Bot to Learn a New Language. It used to be difficult to learn a new language 10–20 years ago. We shall now see how to set up the bot in your local environment, as well as on the cloud. Telegram Bot: to create your Telegram bot, just visit BotFather (details are here). Enable APIs: enable the 3 Google Cloud APIs (Text-to-Speech, Speech-to-Text, Translate) after logging in to your Google Cloud console (screenshots of our 3 musketeers). 2. Google Cloud Setup: unless you plan to switch on your computer 24/7, you will want to host your bot in the cloud with 3 simple steps.
Top 3 Statistical Paradoxes in Data Science
       
Observation bias and sub-group differences can easily produce statistical paradoxes in any data science application. In this article we look at the 3 most common kinds of statistical paradoxes encountered in Data Science. Berkson's Paradox: a striking example is the observed negative association between COVID-19 severity and smoking cigarettes reported in some observational studies. Latent Variable Paradox: "fire severity" is a latent variable for both "number of firefighters deployed" and "number of injured". Conclusions: latent variables, collider variables, and class imbalance can easily produce statistical paradoxes in many data science applications.
Practical Cython — Music Retrieval: Detrended Fluctuation Analysis (DFA)
       
Fig. 7: initialize and convert all the data to memory views, so they can be read in C. The main C function dfa is then called, and memory views are passed as &memoryview[0] in order to retrieve all the final values from the C code. Fig. 8: import the necessary libraries and define constants and essential functions for DFA. After these essential imports, the main dfa function can be defined, thinking of 3 principal chunks. Fig. 9: first step of the dfa function. Fig. 10: second step of the dfa function. The very last lines of the DFA C code are pretty straightforward and compute the log2 transform of the RMS, in order to return the Hurst exponent.
Feature Stores need an HTAP Database
       
A complete Feature Store keeps history of feature values and uses it to create time-accurate training data sets. A feature store in production that keeps feature values up to date, enables ML models to be quickly moved into production. We use blue to indicate high volume data processing found in massively parallel processing database engines usually referred to as OLAP. The joins span multiple different feature groups and accurately bind each training case with the feature values that correspond to the time of each event. Splice Machine’s Feature Store and its in-database model deployment provide a perfect combination to deliver both real-time and batch inference directly on the Feature Store.
How to Baseline Deep Learning Tasks in a Flash
       
PyTorch Lightning Flash is a new library from the creators of PyTorch Lightning that enables quick baselining of state-of-the-art Deep Learning tasks on new datasets in a matter of minutes. Flash is built on top of PyTorch Lightning to abstract away the unnecessary boilerplate for common Deep Learning tasks, which makes it ideal for data science, Kaggle competitions, industrial AI, and applied research. As such, Flash provides seamless support for distributed training and inference of Deep Learning models. Creating your first Deep Learning baseline with Flash (photo by Lalesh Aldarwish from Pexels): all code for the following tutorial can be found in the Flash repo under Notebooks. The steps are: 1. choose a Deep Learning task, 2. load data, 3. pick a state-of-the-art model, 4. fine-tune the task, 5. predict. Now let's get started! Step 1: Choose a Deep Learning Task (photo by Pixabay from Pexels). The first step of the applied Deep Learning process is to choose the task we want to solve.
What We Love About Prefect
       
Prefect is built on top of Dask, and they share some core contributors, so we were confident in Prefect from the start. This means setting up our own Prefect server on AWS and not depending on Prefect Cloud. The only hurdles we encountered were related to certain Prefect Core components that assume you're using Prefect Cloud (the proprietary, enterprise part of Prefect). Because we're running our own Prefect Server (instead of Prefect Cloud), we have zero dependencies on Prefect as a third-party service. Within a couple of days, Prefect also merged and deployed our improvements to their documentation on deploying Prefect with Helm.
Managing Annotation Mistakes with FiftyOne and Labelbox
       
Managing Annotation Mistakes with FiftyOne and Labelbox. Photo by Johnny McClung on Unsplash. As computer vision datasets grow to contain millions of images, annotation errors undoubtedly creep in. FiftyOne directly integrates with the annotation tool Labelbox, letting us easily re-annotate problematic samples. Setup: both the FiftyOne and Labelbox APIs are installable through pip:

pip install fiftyone
pip install labelbox

Using Labelbox first requires you to create a free account, which gives you access to the Labelbox annotation app and lets you upload raw data. Case 1: the dataset is in Labelbox. If you have a dataset you are annotating in Labelbox and just want to use FiftyOne to explore the dataset and find annotation mistakes, the Labelbox integrations provided in FiftyOne make this fast and easy.
The Three Decoding Methods For NLP
       
The Three Decoding Methods For NLP. Photo by Jesse Collins on Unsplash. One of the often-overlooked parts of sequence generation in natural language processing (NLP) is how we select our output tokens, otherwise known as decoding. You may be thinking: we select a token/word/character based on the probability each token is assigned by our model. When we are selecting a token in machine-generated text, we have a few alternative methods for performing this decode, and options for modifying the exact behavior too. In this article we will explore three different methods for selecting our output token: greedy decoding, random sampling, and beam search. It's pretty important to understand how each of these works; often in language applications, the solution to a poor output can be a simple switch between these three methods. If you'd prefer video, I cover all three methods there too, and I've included a notebook for testing each of them with GPT-2 here; a short sketch follows below.
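A hedged sketch of all three strategies with the Hugging Face transformers generate API:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
inputs = tokenizer("The three decoding methods are", return_tensors="pt")

greedy = model.generate(**inputs, max_length=30)                    # greedy decoding
sampled = model.generate(**inputs, max_length=30, do_sample=True,
                         top_k=50, temperature=0.8)                 # random sampling
beam = model.generate(**inputs, max_length=30, num_beams=5,
                      early_stopping=True)                          # beam search

print(tokenizer.decode(greedy[0], skip_special_tokens=True))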
Paper’s maths explained. There are plenty of new papers and it…
       
Paper's maths explained. There are plenty of new papers, and it is often difficult to understand the math behind them. The interest of having efficient translation systems: efficient language translation is crucial for several reasons, e.g. it is widely used, even in real time. Here are the main improvements: use of recent recurrent solutions, namely Simpler Simple Recurrent Units (SSRU) and Average Attention Networks (AAN). A translation neural network can be compared to a stream of water having different possible states in each position, sequentially connected to each other. Average Attention Networks (AAN) compute a cumulative-average operation to generate a context-sensitive representation for each input embedding.
PyTorch ELMo, trained from scratch
       
In ELMo, we are only using the character-level uncontextualized embeddings. Internally, it converts the character indices into character embeddings with a learning embedding matrix. Doing so gives us an uncontextualized word embedding that includes subword information. Uncontextualized word embeddings have been around for quite a while. With uncontextualized embeddings, the embeddings of “bank” are the same in “river bank” and “bank account”.
Sensitivity Analysis of Dataset Size vs. Model Performance
       
In this tutorial, you will discover how to perform a sensitivity analysis of dataset size vs. model performance. Sensitivity analysis provides an approach to quantifying the relationship between model performance and dataset size for a given model and prediction problem.

# evaluate a model
def evaluate_model(X, y):
    # define model evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # define model
    model = DecisionTreeClassifier()
    # evaluate model
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    # return summary stats
    return [scores.mean(), scores.std()]

Further reading: tutorials, APIs, and articles. Summary: in this tutorial, you discovered how to perform a sensitivity analysis of dataset size vs. model performance for a given model and prediction problem.
Google AI Blog: The Technology Behind Cinematic Photos
       
Camera 3D model courtesy of Rick Reitano. So, we created our own dataset for training the monocular depth model using photos captured on a custom 5-camera rig as well as another dataset of Portrait photos captured on Pixel 4. We want the creation to include as much of the salient regions as possible when framing the virtual camera. Now that you know how they are created, keep an eye open for automatically created Cinematic photos that may appear in your recent memories within the Google Photos app! AcknowledgmentsCinematic Photos is the result of a collaboration between Google Research and Google Photos teams.
FAIR unveils U.K. PhD program in partnership with UCL
       
That’s why one of our missions at Facebook AI Research (FAIR) is to foster collaboration in the field by open-sourcing our work and sharing our data, as well as by supporting fundamental and applied AI research taking place in academic institutions around the world. Today, we are announcing a four-year AI research partnership with University College London (UCL) as part of the expansion of our PhD program to the U.K. Our PhD programs support PhD students undertaking cutting-edge AI research at leading universities. UCL is a natural partner for our PhD program: the university’s Computer Science Department has been recognized by the Research Excellence Framework as a top-ranking institution engaging in high-quality research in the U.K. Several of our FAIR London researchers, including Tim Rocktäschel, Edward Grefenstette, and Sebastian Riedel, also have dual affiliation at UCL. Our FAIR London site is already host to several PhD students, including Patrick Lewis, a third-year PhD student working on teaching machines to answer any natural language question.
Architect and build the full machine learning lifecycle with AWS: An end-to-end Amazon SageMaker demo
       
The output of SageMaker Data Wrangler is data transformation code that works with SageMaker Processing, SageMaker Pipelines, SageMaker Feature Store, or with Pandas in a plain Python script. Detecting bias: with SageMaker Clarify, in the data prep or training phases, we can detect pre-training bias (data bias) and post-training bias (model bias). To build this pipeline, we prepare some data (customers and claims) by ingesting it into SageMaker Data Wrangler and applying various transformations within SageMaker Studio. The resulting dataset has 31 columns with 5,000 non-null values each: the policy ID; incident details (severity, type, month/day/hour, vehicles involved, injuries, witnesses, police report availability); claim amounts (injury, vehicle, total); the fraud label; and one-hot encoded driver relationship, incident type, collision type, and authorities-contacted fields.
Applying Bayesian Networks to Covid-19 Diagnosis
       
In our example, the node in the top layer, Bordetella pertussis, doesn't have any parents; Adenovirus's parent node is Bordetella pertussis, while its children are CoronavirusNL63 and Strepto A; and in turn Strepto A does not have any children. For the network depicted above, our JPD would be:

P(Bordetella pertussis, Adenovirus, CoronavirusNL63, Coronavirus HKU1, …, Parainfluenza 1, Chlamydophilia pneumoniae) = P(Bordetella pertussis) × P(Adenovirus | Bordetella pertussis) × P(CoronavirusNL63 | Adenovirus, Bordetella pertussis) × P(Coronavirus HKU1 | CoronavirusNL63, Bordetella pertussis) × … × P(Parainfluenza 1 | Bordetella pertussis, Rhinovirus/Entenovirus) × P(Chlamydophilia pneumoniae | Parainfluenza 1, Bordetella pertussis)

It is possible to obtain these CPTs in two different ways: we can either use a frequentist approach, learning them from a historical dataset, or we can elicit the CPTs from our domain knowledge. In our case, the Hospital Israelita dataset is composed of 5644 observations, each corresponding to a patient that was tested for Covid-19. Out of the remaining variables, discrete variables will be filled with the value -999, while continuous variables will be filled with the median. As we've said before, our BN works better with discrete variables, so we need to discretise our continuous variables.
Advanced Permutation Importance to Explain Predictions
       
PERMUTATION IMPORTANCE: BASIC USAGE. Permutation importance is a frequently used procedure for feature importance computation that every data scientist must know. The permutation importance is defined to be the difference between the permutation metric and the baseline metric. Example of permutation importance for feature evaluation (image by the author): the graph above is very common when computing permutation importance. Permutation importance for feature evaluation in a classification task (image by the author): we retrieve, as before, the importance score per sample. SUMMARY: in this post, we introduced the basic concepts of permutation importance as a procedure for feature importance computation.
PyTorch Lightning and Optuna: Multi-GPU hyperparameter optimisation
       
PyTorch Lightning and Optuna: Multi-GPU hyperparameter optimisation. Photo by Gustavo Campos on Unsplash. Probably most people reading this post have at least once trained an ML model that took a considerable amount of time. For me, one of the most appealing features of PyTorch Lightning is seamless multi-GPU training, which requires minimal code modification. PyTorch Lightning is a wrapper on top of PyTorch that aims at standardising the routine sections of an ML model implementation. PyTorch Lightning provides a base class, LightningDataModule, to download, prepare, and serve a dataset of choice to a model. Another great thing about PyTorch Lightning is that there is no need to specify device(s) when initialising IntelDataModule, as everything is done automatically by the trainer later.
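A hedged sketch of how the two fit together (LitModel is a hypothetical LightningModule, and the metric name assumes you log val_loss; the Trainer gpus argument matches the Lightning API of the time):

import optuna
import pytorch_lightning as pl

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    hidden = trial.suggest_int("hidden_size", 32, 256)

    model = LitModel(lr=lr, hidden_size=hidden)
    dm = IntelDataModule()                       # the LightningDataModule from the post

    trainer = pl.Trainer(max_epochs=5, gpus=-1)  # -1: use all available GPUs
    trainer.fit(model, datamodule=dm)
    return trainer.callback_metrics["val_loss"].item()

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
print(study.best_params)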
[Text-to-SQL] Learning to query tables with natural language
       
One WHERE condition. Q: What is Mark Zuckerberg's salary? SQL: SELECT Base Salary FROM Compensation WHERE Name = 'Mark Zuckerberg'. To successfully translate "What is Mark Zuckerberg's salary?" into SQL, the model needs to know that it should look up "Mark Zuckerberg" in the 'Name' column. SQL: SELECT Name FROM Compensation WHERE Department = 'Software Engineering' AND Base Salary < 100000. The model should match "who" with the 'Name' column, "software engineering" with the 'Department' column, and "make … base" with the 'Base Salary' column. Q: "What is Mark Zuckerberg's salary?" SQL: SELECT Base Salary FROM Compensation c JOIN Employees e ON c.EmployeeId = e.EmployeeId WHERE e.FirstName = 'Mark' AND e.LastName = 'Zuckerberg'. The example table "Compensation" has employees' names and salary numbers together.
7 Real-World Datasets to Learn Everything needed about Machine Learning
       
7 Real-World Datasets to Learn Everything Needed about Machine Learning. Photo by Franki Chamaki on Unsplash. I'm actively involved in mentoring and coaching people in data science. If you prefer video format, check here: https://www.youtube.com/watch?v=8agqUFZOsO0. Housing price dataset: the Boston housing price dataset and the Melbourne housing price dataset are two popular housing price datasets. The Melbourne dataset is much larger, and its attributes focus on the property rather than the locality. Use cases: predicting the housing price. After analysis and preprocessing, divide the dataset into train and test sets, then implement a model to predict the housing price. Here is a Kaggle notebook for reference to learn about performing exploratory data analysis using the Melbourne housing data.
Visual Guide to the Confusion Matrix
       
True Positive (TP): The predicted class was positive for the virus and the actual class was positive. False Negative (FN): The predicted class was negative and the actual class was positive. True Negative (TN): The predicted class was negative for the virus and the actual class was also negative. Binary-class confusion matrices like this can help guide our decision making when building a classification model. After comparing each predicted class with the associated label, the confusion matrix starts to look similar to the binary-class situation.
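A minimal sketch computing all four cells with scikit-learn (toy labels):

from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = positive for the virus
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")  # TP=3, FP=1, FN=1, TN=3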
Multi-Agent Deep Reinforcement Learning in 13 Lines of Code Using PettingZoo
       
Multi-Agent Deep Reinforcement Learning in 13 Lines of Code Using PettingZoo. This tutorial provides a simple introduction to using multi-agent reinforcement learning, assuming a little experience in machine learning and knowledge of Python. Multi-agent Reinforcement Learning: learning to play multiplayer games represents many of the most profound achievements of artificial intelligence in our lifetimes. Using reinforcement learning to control multiple agents is, unsurprisingly, referred to as multi-agent reinforcement learning. This can be visualized as follows (copyright Justin Terry 2021). Multi-agent deep reinforcement learning, what we'll be doing today, similarly just uses deep neural networks to represent the learned policies in multi-agent reinforcement learning. A few years back, OpenAI released the "baselines" repository, which included implementations of most of the major deep reinforcement learning algorithms.
5 Types of Machine Learning Bias Every Data Scientist Should Know
       
Machine learning bias is a term used to describe situations where an algorithm produces results that are not correct because of inaccurate assumptions made during one of the machine learning process steps. To develop any machine learning process, the data scientist needs to go through a set of steps, from collecting the data and cleaning it, to training the algorithm and then deploying it. All subfields of data science, whether machine learning, natural language processing, or any other, depend on data. This article will go through the 5 main types of machine learning bias, why they occur, and how to reduce their effect. №2: Sample bias. Another cause of bias in machine learning applications is sample bias.
Audio Deep Learning Made Simple (Part 3): Data Preparation and Augmentation
       
This is the third article in my series on audio deep learning. Both data preparation and augmentation are essential for getting better performance from our audio deep learning models. What problems is audio deep learning solving in our daily lives? Mel Spectrogram hyperparameters: this gives us the hyperparameters for tuning our Mel Spectrogram; a small sketch follows below. Augmentation by adding noise (image by author). Conclusion: we have now seen how we pre-process and prepare audio data for input to deep learning models.
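A hedged sketch of computing a Mel Spectrogram with torchaudio (hyperparameter values are illustrative, not the article's):

import torch
import torchaudio.transforms as T

waveform = torch.randn(1, 16000)   # stand-in for 1 second of 16 kHz audio

mel = T.MelSpectrogram(
    sample_rate=16000,
    n_fft=1024,       # FFT window size
    hop_length=256,   # stride between frames
    n_mels=64,        # number of Mel bands
)
spec = mel(waveform)
print(spec.shape)     # (1, 64, n_frames)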
Intro to reinforcement learning: temporal difference learning, SARSA vs. Q-learning
       
Intro to reinforcement learning: temporal difference learning, SARSA vs. Q-learning. Reinforcement learning (RL) is surely a rising field, given the huge influence of the performance of AlphaZero (the best chess engine as of now). Among RL's model-free methods is temporal difference (TD) learning, with SARSA and Q-learning (QL) being two of its most used algorithms. The outline of this post includes: temporal difference learning (TD learning), parameters, QL & SARSA, comparison, implementation, and conclusion. We will compare these two algorithms via the CartPole game implementation. Temporal Difference Learning (TD Learning): one of the problems with the environment is that rewards usually are not immediately observable. To avoid storing the entire state space for games with larger state spaces, we can use a deep Q-network. The two update rules are sketched below.
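A minimal sketch of the two update rules (Q is a table of state -> action values; alpha and gamma are the learning rate and discount factor):

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    # on-policy: bootstrap with the action actually taken next
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # off-policy: bootstrap with the greedy (max-value) action
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])

The single max is the whole difference between the two, and it is why Q-learning can learn the greedy policy while following an exploratory one.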
Liquid Neural Networks in Computer Vision
       
Liquid Neural Networks in Computer Vision. In this post, we will discuss the new liquid neural networks and what they might mean for the vision field. Jacob Solawetz, Feb 12, 4 min read. Excitement is building in the artificial intelligence community around MIT's recent release of liquid neural networks. The Liquid Neural Network Design: the liquid neural network is a form of recurrent neural network that processes data in time series (see the liquid neural network algorithm in the arXiv paper). For more implementation details, we recommend a deep dive into the liquid neural network paper. The Liquid Neural Network Promise: the liquid neural network is shown to improve on time series modeling across a variety of domains — human gestures, human activities, traffic, power, ozone, sequential MNIST, and occupancy. Conclusion: liquid neural networks are a new breakthrough in recurrent neural networks, creating a model that adapts flexibly through time.
Essential Math for Data Science: Eigenvectors and application to PCA
       
ESSENTIAL MATH FOR DATA SCIENCE. Essential Math for Data Science: Eigenvectors and application to PCA (image by author). Matrix decomposition, also called matrix factorization, is the process of splitting a matrix into multiple pieces. Eigenvectors and Eigenvalues: as you can see in Chapter 7 of Essential Math for Data Science, you can consider matrices as linear transformations. Now, let's create the data matrix X with the two variables: temperatures and consumption. You can see in Figure 7 that the eigenvectors of the covariance matrix give you the important directions of the data. Then, you just need to change the basis of the data using the eigenvectors as the new basis vectors, as sketched below.
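A minimal numpy sketch of PCA as an eigendecomposition of the covariance matrix (my own illustration with synthetic data):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) @ np.array([[2.0, 1.0], [0.0, 0.5]])  # correlated data
X = X - X.mean(axis=0)                 # center the data matrix

C = np.cov(X, rowvar=False)            # covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)   # eigh is made for symmetric matrices
order = np.argsort(eigvals)[::-1]      # sort by explained variance
components = eigvecs[:, order]

X_pca = X @ components                 # change of basis onto the eigenvectors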
How to Predict Stock Prices with LSTM
       
How to Predict Stock Prices with LSTM. Disclaimer: I do not believe that any ML or AI model can predict the future price of stocks. In this tutorial, I provide an example of how you can apply LSTM models to predicting stock prices. In a previous post, we used LSTM models for Natural Language Generation (NLG), like the word-based and character-based NLG models. LSTM model for stock prices: we will build an LSTM model to predict hourly stock prices, as sketched below. Get the data: in our case, we will predict 251 observations ahead, as many as there are in the test dataset.
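A hedged sketch of the windowing plus model setup (window size and layer sizes are assumptions; prices_scaled stands for your scaled price series):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def make_windows(series, window=24):
    # each sample is `window` past prices; the label is the next price
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., None], np.array(y)   # (samples, timesteps, 1)

X, y = make_windows(prices_scaled)

model = Sequential([
    LSTM(64, input_shape=(X.shape[1], 1)),
    Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=20, batch_size=32)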
Detecting Beeps from Brains
       
Either they coincide with the start of a beep, or they are cut from a time away from any beeps. I started by replicating some methods of Neural Activity Classification with Machine Learning Models Trained on Interspike Interval Series Data (Lazarevich et al., 2018). Figure 8: ISI-statistics based approaches of Neural Activity Classification with Machine Learning Models Trained on Interspike Interval Series Data (2018). I again tried many ML models learning on these histograms, including dense neural networks (NNs), 1D convolutional NNs, locally connected NNs, residual NNs, long-short-term-memory NNs, and XGBoost. However in this experiment, the rats were only trained (conditioned) in later sessions to actually respond to the beeps.
MiniRocket: Fast(er) and Accurate Time Series Classification
       
MiniRocket: Fast(er) and Accurate Time Series Classification. Most state-of-the-art (SOTA) time series classification methods are limited by high computational complexity. MiniRocket (MINImally RandOm Convolutional KErnel Transform) is a (nearly) deterministic reformulation of Rocket that is 75 times faster on larger datasets and boasts roughly equivalent accuracy. Rocket transforms time series by first convolving each series with 10,000 random convolutional kernels with random length, weights, bias, dilation, and padding. The number of MiniRocket kernels is kept as small as possible, while still maintaining accuracy, in order to maximize computational efficiency.
End to End Deep Learning: A Different Perspective
       
Now, all we need to do is to train the object detection model and we are good to go, right? We always hear terms and phrases like, “Big Data”, “Data is the new gold”, “Data in abundance”, but the fact of the matter is that unstructured and unprocessed data is in abundance. And again, when we start learning about Deep Learning and Data Science in general, we come across datasets like Iris or MNIST, etc. We need a dataset of food ingredients, with bounding box information of different labels across all these images. Hence, an easy alternative to this would be to download the data from different free sources and web searches.
A new contender for ETL in AWS?
       
This makes Glue a great tool if you have large-scale datasets that can be imported using DynamicFrame, but inefficient at best for anything less. (https://docs.aws.amazon.com/glue/latest/dg/dev-endpoint-tutorial-pycharm.html)This doesn’t set up Glue as a good tool for day-to-day data science/machine learning exploration. So, enough teasing - let’s explore why I’m now using Metaflow for my Python ETL jobs. A Flow or step can then have annotations applied to it to add a further configuration. In my previous example of creating an ETL pipeline, I wouldn’t need to store the data into S3 myself as Metaflow would do this for me.
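A minimal sketch of what a Metaflow flow looks like (my illustration; step contents are placeholders):

from metaflow import FlowSpec, step

class ETLFlow(FlowSpec):

    @step
    def start(self):
        self.raw = list(range(10))               # pretend extract
        self.next(self.transform)

    @step
    def transform(self):
        self.clean = [x * 2 for x in self.raw]   # pretend transform
        self.next(self.end)

    @step
    def end(self):
        print(f"loaded {len(self.clean)} rows")  # pretend load

if __name__ == "__main__":
    ETLFlow()

Any attribute assigned to self is versioned and persisted between steps, which is the behavior referred to above about not having to store intermediate data in S3 yourself.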
Forgetful pandas
       
In this article I'll show you a few ways to get the true memory usage for your pandas objects. Pandas may or may not be forgetful, but elephants are reported to have excellent memories. The dask family of packages mimics the pandas API and other common Python data science APIs. Wrap: now you know how to find the real memory usage in pandas; a short sketch follows below. I write about data science, Python, pandas, and other tech topics.
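As a quick reference, a minimal sketch of the shallow-vs-deep measurement:

import pandas as pd

df = pd.DataFrame({"city": ["Oslo", "Lima", "Pune"] * 100_000,
                   "temp": [3.1, 19.4, 27.8] * 100_000})

print(df.memory_usage())           # shallow: misses the Python string objects
print(df.memory_usage(deep=True))  # deep: counts the string contents too

# converting repetitive strings to a categorical often shrinks them drastically
print(df["city"].astype("category").memory_usage(deep=True))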
The Human Bias-Accuracy Trade-off
       
Image source: photo by Jon Tyson on Unsplash. The Human Bias-Accuracy Trade-off: to be human is to be biased? Everything boils down to optimizing one particular trade-off, the bias-variance tradeoff: making sure that we get accurate predictions for any part of the population data. For the sake of our analysis, we retained the director-company pairing with the highest total direct compensation of director X in company Y. As seen below, the mean total direct compensation for males is almost double that for females. Thus, the model is only able to explain 38% of the variance in total direct compensation.
Gradio vs Streamlit vs Dash vs Flask
       
Once a model is complete, it likely has to be deployed before it can deliver any sort of value. As well, being able to deploy a preliminary model or a prototype to get feedback from other stakeholders is extremely useful. Recently, there has been an emergence of several tools that Data Scientists can use to quickly and easily deploy a machine learning model. In this article, we’re going to look at 4 alternatives that you can use to deploy a machine learning model: Gradio, Streamlit, Dash, and Flask. Keep in mind that this is an opinionated article and is solely based off of my knowledge and experiences with these tools.
How to Use fastai to Evaluate DICOM Medical Files
       
This article is going to show you some helpful tips to get up-and-running fast on learning about DICOM medical files and the data associated with them. The first question you might be asking yourself (or not if you were Googling “fastai dicom files”): what are DICOM files? If you are used to using fastai you’ll be familiar with a few imports, but note the medical import. To plot an X-ray, we can select an entry in the items list and load the DICOM file with dcmread. If anyone does something great with fastai and/or medical data, I want to hear about it!
Train your own object detector with Faster-RCNN & PyTorch
       
Here's the table of contents: getting images, annotating, dataset building, Faster R-CNN in PyTorch, training, and inference. Getting images: in order to train an object detector with a deep neural network like Faster R-CNN, we require a dataset. Annotating: there are plenty of web tools that can be used to create bounding boxes for a custom dataset. For this tutorial, we'll stick to our head bounding boxes and delete the eye layer that I showed above. This gives us the following: keys: dict_keys(['labels', 'boxes']); labels: array(['head', 'head', 'head', 'head', 'head', 'head'], dtype='<U4').
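A minimal torchvision sketch of the model setup for such a dataset (label 0 is background, label 1 is 'head'; the image and box are dummies):

import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# start from a COCO-pretrained Faster R-CNN and swap the box head
model = fasterrcnn_resnet50_fpn(pretrained=True)
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

# training takes a list of images and a list of target dicts
images = [torch.rand(3, 300, 400)]
targets = [{"boxes": torch.tensor([[50.0, 60.0, 120.0, 140.0]]),
            "labels": torch.tensor([1])}]
model.train()
loss_dict = model(images, targets)   # returns the component losses
print({k: float(v) for k, v in loss_dict.items()})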
How to Train a Fine-Grained Multilabel Classification Model on Noisy Data
       
That approach is called Label Smoothing and was discussed in great detail in the paper When Does Label Smoothing Help?. Despite the noisy training data, the model was able to learn the fashion concepts and generalize well to unseen user images. Summary: in this article, I demonstrated how to train a model for multilabel attribute recognition on noisy data using the Fastai library and the DeepFashion dataset. There is no doubt that the model performance could be significantly improved by improving the quality of the training labels. As we have seen, there are approaches that make it possible to deliver operation-ready models while continuously working on improving the quality of the training data.
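A sketch of label smoothing for the multilabel case (my own minimal PyTorch version, not the article's exact code): hard 0/1 targets are softened so that confidently wrong (noisy) labels are penalized less.

import torch
import torch.nn.functional as F

def label_smoothing_bce(logits, targets, eps=0.1):
    # pull targets away from 0 and 1 toward eps/2 and 1 - eps/2
    smoothed = targets * (1 - eps) + 0.5 * eps
    return F.binary_cross_entropy_with_logits(logits, smoothed)

loss = label_smoothing_bce(torch.randn(4, 10), torch.randint(0, 2, (4, 10)).float())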
Visualizing Learning of a Deep Neural Network
       
Visualizing Learning of a Deep Neural Network. Keras model learning (source: by author). Deep Learning is generally considered a black-box technique because you usually can't analyze how it works in the back-end. You can use this visualization for educational purposes, or present it to others to show them how your model is learning. Deep Replay is an open-source Python package designed to let you visualize a replay of how your model's training process is carried out in Keras. Importing required libraries: as we are creating a deep neural network, we need to import the required libraries.
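As a stand-in sketch of the underlying idea (this is my own minimal Keras callback, not the Deep Replay API): record a snapshot of the weights after every epoch so the learning process can be replayed afterwards.

from tensorflow import keras

class WeightHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs=None):
        self.snapshots = []
    def on_epoch_end(self, epoch, logs=None):
        # copy the weights so later training doesn't mutate the snapshot
        self.snapshots.append([w.copy() for w in self.model.get_weights()])

# usage: model.fit(x, y, epochs=10, callbacks=[WeightHistory()])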
Increase the Accuracy of Your CNN by Following These 5 Tips I Learned From the Kaggle Community
       
Another method for splitting your data into a training set and validation set is K-Fold Cross-Validation. You then train your model K times, each time with a different training and validation set. This way, you make optimal use of all your training data, as sketched below. The first array contains indices from our training data for training. Number of training images per disease category (image by the author): this is the case with our training data.
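A minimal K-Fold sketch with placeholder data, matching the description above (the first array per split holds training indices, the second validation indices):

import numpy as np
from sklearn.model_selection import KFold

X = np.arange(100).reshape(50, 2)
kf = KFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, val_idx in kf.split(X):
    X_train, X_val = X[train_idx], X[val_idx]
    # fit the model on X_train, evaluate on X_val; repeat per fold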
Automating Hyperparameter Tuning of Keras Model
       
Automating Hyperparameter Tuning of a Keras Model (photo by Uriel SC on Unsplash). Building a model is of no use if you cannot optimize it for good performance and higher accuracy. Generally, building the model takes less time than optimizing it, because during tuning you need to look for the best parameters, which is a time-consuming process. We can automate this process of finding the best hyperparameter values and getting the highest accuracy out of the Keras model. In this article, we discuss Hyperas, an open-source Python package for automating Keras model hyperparameter tuning.

!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
# Authenticate and create the PyDrive client.
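A sketch of the Hyperas pattern with toy random data (the layer sizes and search space are illustrative): hyperparameters are written as {{...}} templates inside the model function, and optim.minimize searches over them.

import numpy as np
from hyperopt import Trials, STATUS_OK, tpe
from hyperas import optim
from hyperas.distributions import choice, uniform

def data():
    x = np.random.rand(200, 784)
    y = np.random.randint(0, 10, 200)
    return x[:150], y[:150], x[150:], y[150:]

def model(x_train, y_train, x_test, y_test):
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, Dropout
    model = Sequential()
    model.add(Dense({{choice([256, 512])}}, activation='relu', input_shape=(784,)))
    model.add(Dropout({{uniform(0, 0.5)}}))
    model.add(Dense(10, activation='softmax'))
    model.compile('adam', 'sparse_categorical_crossentropy', metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=2, verbose=0)
    acc = model.evaluate(x_test, y_test, verbose=0)[1]
    return {'loss': -acc, 'status': STATUS_OK, 'model': model}

best_run, best_model = optim.minimize(model=model, data=data,
                                      algo=tpe.suggest, max_evals=5, trials=Trials())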
pytorch-widedeep: deep learning for tabular data
       
The wide component of a wide and deep model is simply a linear model, and in pytorch-widedeep such a model can be created via the Wide class. TabTransformer: details on the TabTransformer can be found in TabTransformer: Tabular Data Modeling Using Contextual Embeddings. (Image by author.) The TabMlp is the simplest architecture and is very similar to the tabular model available in the fantastic fastai library. References: [1] TabNet: Attentive Interpretable Tabular Learning, Sercan O. Arik, Tomas Pfister, arXiv:1908.07442v5. [2] TabTransformer: Tabular Data Modeling Using Contextual Embeddings. Alessandro Raganato, Yves Scherrer, Jörg Tiedemann, 2020, arXiv:2002.10260v3. [6] Neural Oblivious Decision Ensembles for Deep Learning on Tabular Data.
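To make the "simply a linear model" point concrete, here is my own plain-PyTorch stand-in for the wide component (a sketch, not the library's Wide class itself):

import torch.nn as nn

class Wide(nn.Module):
    # the wide part is a linear map over one-hot / cross-product features
    def __init__(self, num_features, pred_dim=1):
        super().__init__()
        self.linear = nn.Linear(num_features, pred_dim)
    def forward(self, x):
        return self.linear(x)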
Prediction Intervals for Deep Learning Neural Networks
       
There are no standard techniques for calculating a prediction interval for deep learning neural networks on regression predictive modeling problems. In this tutorial, you will discover how to calculate a prediction interval for deep learning neural networks. Calculating prediction intervals for nonlinear regression algorithms like neural networks is challenging compared to linear methods like linear regression, where the prediction interval calculation is trivial.

# make predictions with prediction interval
newX = asarray([X_test[0, :]])
lower, mean, upper = predict_with_pi(ensemble, newX)
print('Point prediction: %.3f' % mean)
print('95%% prediction interval: [%.3f, %.3f]' % (lower, upper))
print('True value: %.3f' % y_test[0])

Summary: in this tutorial, you discovered how to calculate a prediction interval for deep learning neural networks.
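One simple ensemble-spread version of predict_with_pi, sketched with bootstrap-fitted linear models instead of the tutorial's neural networks (the data and model choice are mine, for brevity):

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 200)

# a small "ensemble": the same model fit on bootstrap resamples
ensemble = []
for _ in range(20):
    idx = rng.integers(0, len(X), len(X))
    ensemble.append(LinearRegression().fit(X[idx], y[idx]))

def predict_with_pi(ensemble, X):
    yhat = np.array([m.predict(X) for m in ensemble])
    mean, std = yhat.mean(), yhat.std()
    return mean - 1.96 * std, mean, mean + 1.96 * std   # ~95% interval

print(predict_with_pi(ensemble, X[:1]))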
High-Accuracy Covid 19 Prediction from Chest X-Ray Images using Pre-Trained Convolutional Neural Networks in PyTorch
       
Learning rate: a lower learning rate leads to better convergence but not necessarily the best test accuracy. I felt that the problem was that the test images were not being transformed as the training images were, and the untransformed test images were somehow resembling the transformed training images of label "1" (more on this crucial topic later). Then I created pandas data frames to contain the training dataset image names and their labels, and the test dataset image names, respectively. I then set the learning rate to 10e-4. Learning rate decay is a parameter used to update the learning rate after each epoch, and I set a conservative decay rate of 0.99 as the learning rate is already low.
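A minimal PyTorch sketch of that decay setup (the tiny linear model stands in for the pre-trained CNN):

import torch
import torch.nn as nn
from torch.optim.lr_scheduler import ExponentialLR

model = nn.Linear(10, 2)                        # stand-in for the CNN
optimizer = torch.optim.Adam(model.parameters(), lr=10e-4)
scheduler = ExponentialLR(optimizer, gamma=0.99)  # decay rate of 0.99

for epoch in range(3):
    # ... one training epoch would run here ...
    scheduler.step()                            # update the LR each epoch
    print(epoch, scheduler.get_last_lr())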
How to prevent Data Leakage while evaluating the performance of a Machine Learning model
       
How to prevent data leakage while evaluating the performance of a machine learning model (image by Chris Ried on Unsplash). Data leakage during model evaluation occurs when information from the validation/test set leaks into the training process. The dataset has no missing values; hence, a hundred missing values are introduced randomly for a better demonstration of data leakage. Similarly, the mean and standard deviation used to scale the data are also computed using 'X_train' only. The validation RMSE (with data leakage) being closer to the RMSE on unseen data is just by chance. Hence, using Pipeline for k-fold cross-validation prevents data leakage and provides a better estimate of the model's performance on unseen data.
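A sketch of the Pipeline idea with synthetic data (model and features are placeholders): because imputation and scaling live inside the pipeline, each cross-validation fold fits them on its training split only, so no validation statistics leak in.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)
X[np.random.default_rng(0).integers(0, 200, 100), 0] = np.nan  # inject missing values

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
    ("model", Ridge()),
])
scores = cross_val_score(pipe, X, y, cv=5, scoring="neg_root_mean_squared_error")
print(scores.mean())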
Win your colleagues over with Dataiku
       
Dataiku is a fantastic tool for teams of Data Scientists and Analysts that wish to remain lean and accomplish a lot. You may agree on a set of tags, and if your colleagues and you stick to them, you can quickly see in a flow what's what and what's ready to consume. Unfortunately, as is often the case with Dataiku, the documentation is horrendous and will not help you past the most basic usage. You'll need to make a drop-down menu in the application and let the user choose a sales style. Take a sheet of paper and note all the data inputs that your application requires and the outputs it will produce.
Data Science Year Zero.
       
However, Data Science is still finding its footing in most young organizations, unless they sell a data-driven product. Here, I've compiled the learnings (ergo, the mistakes) from my first year working as the sole Data Science personnel at a SaaS startup for non-desk workers in the US and Canada (photo by Kumpan Electric on Unsplash). In no particular order: forgetting the science part of 'Data Science' helps ease the pain sometimes. Be adamant: to get things done, you're going to need people in your corner, preferably senior folk who back you and your ideas up. Dare to step out of the bubble that's been created in the Data Science community, and we might just become good leaders somewhere down the line, owing solely to the unique position that Data Science holds in the Product-Business mix.
Azure Synapse Analytics — Introduction
       
Azure Synapse Analytics — Introduction (credit: iStockPhoto). Synapse Analytics is an integrated platform service from Microsoft Azure that combines the capabilities of data warehousing, data integration, ETL pipelines, analytics tools and services, big-data scale, and visualization and dashboards. Features walkthrough: this assumes that you have already set up and configured the Azure Synapse Analytics workspace in your Azure subscription; workspace setup and configuration are beyond the scope of this article. Setup and integrate a pipeline: this small demo shows how to integrate pipeline tasks in Azure Synapse. From the Integrate hub of Synapse Analytics Studio, select Pipeline. Integrate linked services: use Synapse Analytics Studio to integrate and enable various linked services. (Screenshots credit: MS Azure Synapse Analytics Studio.)
Automatic Open Source-based Data Pipelines? Openshift To the Rescue!
       
In this section we'll deploy a Kafka cluster using the AMQ Streams operator, which is offered via the Openshift Operator Hub. Running Presto for distributed querying: in this demo, we'll use Presto's ability to query S3 bucket prefixes (similar to tables in relational databases). Now that we have a better understanding of our data structure, we can deploy our Presto cluster. Data logic preparation: after all of our infrastructure services are ready, we need to create the data logic behind our streaming application. Then we can query the data, as in the sketch below.
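A minimal client-side sketch, assuming the presto-python-client package; the host, user, and table names are placeholders, not values from the post:

import prestodb

conn = prestodb.dbapi.connect(host="presto-service", port=8080,
                              user="demo", catalog="hive", schema="default")
cur = conn.cursor()
cur.execute("SELECT count(*) FROM events")   # 'events' maps to an S3 prefix
print(cur.fetchall())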
5 Biggest Tips to Juggle Work and Study as Data Scientists
       
Plan your expectations and time well: align expectations with your managers and colleagues at work, and inform your manager to expect you to occasionally spend more time learning during working hours. Batch your focus time: Productivity = Time * Focus. If you are a morning person, do not work on intense programming at night. In my case, I usually come to work at ~8 am and leave at ~4 pm. By batching time for work, study, and relationships, you will get more done.
Top 5 Python Data Visualization Libraries
       
Top 5 Python Data Visualization Libraries (photo by Federica Campanaro on Unsplash). Data visualization is an essential building block of data science. Data visualizations help to explore and understand the underlying structure within data and the relationships between variables. In this article, we will go over the top 5 Python data visualization libraries. We will create the plots using the Melbourne housing dataset available on Kaggle.
NLP for Suicide and Depression Identification with Noisy Labels
       
Overview: SDCNL, which stands for Suicide Depression Classification with Noisy Labels, is a method for distinguishing between suicide and depression using deep learning and noisy label correction. However, unlike other sources of data, data from online forums has not been verified as true and accurate. The concept of labels being corrupted or inaccurate in datasets is referred to as noisy labels. The issue of noisy labels has been very prevalent in the image-processing domain of machine learning, but has not been addressed in the NLP field. (Image by authors.) On the left, we can see how the label correction process is quite helpful: the red curves, from after label correction, are significantly better than the blue ones from before label correction.
1x1 Convolution: Demystified
       
1x1 Convolution: Demystified. Shedding light on the concept of the 1x1 convolution operation, which appears in the paper Network in Network by Lin et al. 1x1 convolution: as the name suggests, the 1x1 convolution operation involves convolving the input with filters of size 1x1, usually with zero padding and a stride of 1. All this using the 1x1 convolution operation! An interesting line of thought was provided by Yann LeCun, who analogizes fully connected layers in CNNs as simply convolution layers with 1x1 convolution kernels and a full connection table. This is a work in progress, but hopefully one can see the point of using a 1x1 convolution here.
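A quick PyTorch sketch of the most common use, channel reduction (the tensor sizes are illustrative): a 1x1 convolution mixes channels at each spatial position without touching the spatial dimensions.

import torch
import torch.nn as nn

x = torch.rand(1, 256, 32, 32)                # 256 input channels
conv1x1 = nn.Conv2d(256, 64, kernel_size=1)   # reduce channels 256 -> 64
print(conv1x1(x).shape)                       # torch.Size([1, 64, 32, 32])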
CTrL and MNTDP, a new open source benchmark and model for continual learning
       
But developing effective CL models comes with its own challenges. Until now, there have been no effective standard benchmarks for evaluating CL systems across these axes. We have also open-sourced a new CL model, MNTDP, that offers superior performance on a variety of benchmarks, including our own. Learn MoreToy illustration of the MNTDP model when learning from a sequence of three classification tasks derived from the MNIST data set of handwritten digits. We are releasing the CTrL benchmark alongside the MNTDP model in hopes that it will help others reproduce our research and test their own CL systems going forward.
Generating Synthetic Tabular Data
       
Generating Synthetic Tabular Data (photo by Hayden Dunsel on Unsplash). Introduction: in the previous article, we introduced the concept of synthetic data and its applications in data privacy and machine learning. In this article, we will show you how to generate synthetic tabular data using a generative adversarial network (GAN). We then check just how similar the synthetic data is to the real data. Conclusion: in this second instalment of the synthetic data series, we look into how to generate a synthetic tabular dataset using a CTGAN. Synthetic data unlocks opportunities for data sharing, experimentation, and analysis at a large scale, without disclosing sensitive information.
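A minimal sketch with the ctgan package and toy data (the class was named CTGANSynthesizer in older releases; the columns here are placeholders):

import pandas as pd
from ctgan import CTGAN

data = pd.DataFrame({"age": [23, 45, 31, 52] * 25,
                     "job": ["a", "b", "a", "c"] * 25})
model = CTGAN(epochs=5)
model.fit(data, discrete_columns=["job"])   # tell the GAN which columns are categorical
synthetic = model.sample(100)               # new rows with a similar distribution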
Implementing Single Shot Detector (SSD) in Keras: Part III — Data Preparation
       
Implementing Single Shot Detector (SSD) in Keras: Part III — Data Preparation. Labeling a dog image with LabelImg; the process of preparing training data for the SSD network. In the SSD paper, the authors use different datasets to train and benchmark their SSD network. To convert a training sample into a suitable format for training SSD, we need to perform two main tasks. You will need to refer to match_gt_boxes_to_default_boxes.py for matching bounding boxes to default boxes and encode_bboxes.py for the code to encode bounding boxes accordingly; a simplified IoU sketch follows below.
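Matching ground-truth boxes to default boxes typically starts from intersection-over-union; here is my own minimal IoU sketch (not the repository's code), with boxes as [xmin, ymin, xmax, ymax]:

import numpy as np

def iou(box, boxes):
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    a = (box[2] - box[0]) * (box[3] - box[1])
    b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (a + b - inter)   # overlap ratio per default box

print(iou(np.array([0, 0, 10, 10]), np.array([[5.0, 5.0, 15.0, 15.0]])))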
Computer Vision on Edge
       
Computer Vision on Edge: a few weeks back, when I was window shopping on AliExpress, I came across the wonderful Maixduino device. It was claimed to carry a RISC-V architecture along with a KPU (a general-purpose neural network processor). Given my interest in edge computing, I thought of presenting a complete end-to-end guide for an object detection example, using the Open Images Database (link). A note on transfer learning: transfer learning is the idea of using a pre-trained model for further specialization. Training on the dataset: we want to train our models so that they can run on the Maixduino device.
How to Beat the CartPole Game in 5 Lines
       
How to Beat the CartPole Game in 5 Lines: CartPole is a game in the OpenAI Gym reinforcement learning environment. Although the solution is only 5 lines long, it performs better than any commonly found machine learning method and completely beats the CartPole game. Table of contents: review of the CartPole problem, analysis of some simple policies, arriving at the 5-line solution, and conclusion. The CartPole problem is also known as the "Inverted Pendulum" problem. Although better than a monkey, the Theta Policy is far from satisfactory. But by showing how to beat the CartPole game in 5 lines, I hope you can appreciate how condensed the laws of physics are.
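A sketch of a theta-style policy (my reconstruction of the idea, not necessarily the author's exact 5 lines; uses the classic pre-0.26 gym API):

import gym

env = gym.make("CartPole-v1")
obs = env.reset()
done, total = False, 0
while not done:
    # obs = [position, velocity, angle, angular velocity];
    # push the cart toward the side the pole is leaning
    action = 0 if obs[2] < 0 else 1
    obs, reward, done, info = env.step(action)
    total += reward
print(total)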
AdaBoost Algorithm: Remarkably Capable But With One Interesting Limitation
       
The story covers the following topics: a visual comparison of model predictions between a single decision tree, a random forest, and AdaBoost; an explanation of how AdaBoost differs from other algorithms; and Python code examples. Decision Tree vs. Random Forest vs. AdaBoost: let us start with a comparison of the prediction probability surface made by each of the three models. They all use the same Australian weather data. Target (a.k.a. dependent variable): 'Rain Tomorrow', with possible values 1 (yes, it rains) and 0 (no, it does not rain). Features (a.k.a.
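A quick sketch of such a three-way comparison on synthetic data (the story uses Australian weather data, which is not bundled here):

from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)
for model in (DecisionTreeClassifier(), RandomForestClassifier(), AdaBoostClassifier()):
    print(type(model).__name__, cross_val_score(model, X, y, cv=5).mean())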
Training your ML model using Google AI Platform and Custom Environment containers
       
Training your ML model using Google AI Platform and custom environment containers (photo by Setyaki Irham on Unsplash). Google AI Platform allows advanced model training using various environments. This tutorial explains how to set one up and train a TensorFlow recommendation model on AI Platform Training with a custom container. My repository can be found here: https://github.com/mshakhomirov/recommendation-trainer-customEnvDocker/. Overview: this tutorial explains how to train a user-items-ratings recommendation model using the WALS algorithm, covering the following steps: local environment setup; writing a Dockerfile and creating a custom container; running the Docker image locally; pushing the image to GCP Container Registry; submitting a custom container training job; and scheduling model training with Airflow. Prerequisites: creating the above-mentioned resources will incur costs of around $0.20. Training dataset: our data for training (repo) will look like this:
Recommendation Systems for Rotating Stores in Video Games (Part Two)
       
Recommendation Systems for Rotating Stores in Video Games (Part Two) (photo by Oneisha Lee on Unsplash). I recommend that readers look over the first part of this blog, found here. We generated a quick estimate to understand personalization's monetary impact on a rotating store. A quick note: since the previous post focused on rotating stores in video games, I will often interchange the term users with players. This throughput can have a wide range; for example, the rotating store could show three things every day, or over a thousand every year. On the one hand, there are signs that it can be quite impactful in most rotating store scenarios.
Recommendation Systems for Rotating Stores in Video Games (Part One)
       
Recommendation Systems for Rotating Stores in Video Games (Part One) (photo by Mae Mu on Unsplash). Part two of this blog has been finished and is located HERE. Similarities to clothing retailers: in theory, many physical clothing retailers act as a rotating store by shifting their displayed inventory to match the seasons. Rotating stores in video games: while some clothing retailers run on a structure similar to a rotating store, the biggest adopter of this monetization strategy has been gaming. Recap of rotating store properties: rotation means the storefront switches purchasable products frequently, and the change cadence is often known. If we had personalized instead, we would have shown the first user the same item and all other users item "C".
Google AI Blog: Introducing Model Search: An Open Source Platform for Finding Optimal ML Models
       
However, designing NNs that generalize well is challenging because the research community's understanding of how a neural network generalizes is currently somewhat limited: what does the appropriate neural network look like for a given problem? Techniques like neural architecture search (NAS) use algorithms such as reinforcement learning (RL), evolutionary algorithms, and combinatorial search to build a neural network out of a given search space. Overview: the Model Search system consists of multiple trainers, a search algorithm, a transfer learning algorithm, and a database to store the various evaluated models. The search algorithms implemented in Model Search are adaptive, greedy, and incremental, which makes them converge faster than RL algorithms. Conclusion: we hope the Model Search code will provide researchers with a flexible, domain-agnostic framework for ML model discovery.
Training, debugging and running time series forecasting models with the GluonTS toolkit on Amazon SageMaker
       
Solution overview: we first show you how to set up GluonTS on SageMaker using the MXNet estimator, then train multiple models using SageMaker Experiments, use SageMaker Debugger to mitigate suboptimal training, evaluate model performance, and finally generate time series forecasts. When you select an algorithm, you can configure the hyperparameters to control the learning process during model training. The Amazon SageMaker Python SDK MXNet estimators and models, together with the SageMaker open-source MXNet container, make writing an MXNet script and running it in SageMaker easier. Creating the MXNet estimator: you can run MXNet training scripts on SageMaker by creating an MXNet estimator, as in the sketch below. Debugger automatically captures data from the model training and provides built-in rules that check for conditions such as overfitting and vanishing gradients.
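A minimal MXNet estimator sketch (SageMaker Python SDK v2); the script name, role ARN, S3 path, and hyperparameters are placeholders, not values from the post:

from sagemaker.mxnet import MXNet

estimator = MXNet(
    entry_point="train.py",        # hypothetical GluonTS training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    instance_type="ml.m5.xlarge",
    instance_count=1,
    framework_version="1.8.0",
    py_version="py37",
    hyperparameters={"epochs": 10, "prediction-length": 24},
)
estimator.fit({"train": "s3://my-bucket/train/"})  # placeholder S3 channel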
How Zopa enhanced their fraud detection application using Amazon SageMaker Clarify
       
In this post, we use Zopa’s fraud detection system for loans to showcase how Amazon SageMaker Clarify can explain your ML models and improve your operational efficiency. Zopa trains its fraud detection model on SageMaker and can use SageMaker Clarify to view a feature attributions plot in SageMaker Experiments after the model has been trained. Zopa uses SageMaker MMS model serving stack in a similar BYOC fashion to register the models for the SageMaker Clarify processing job. With SageMaker Clarify, Zopa can now produce model explanations more quickly and seamlessly. To learn more about SageMaker Clarify, see What Is Fairness and Model Explainability for Machine Learning Predictions?
Reviewing online fraud using Amazon Fraud Detector and Amazon A2I
       
Amazon Fraud Detector is a fully managed service that uses ML and more than 20 years of fraud detection expertise from Amazon to identify potential fraudulent activity so you can catch more online fraud faster. Solution walkthrough: in this post, we set up Amazon Fraud Detector using the AWS Management Console, and set up Amazon A2I using an Amazon SageMaker notebook. Setting up an Amazon A2I human loop with Amazon Fraud Detector: in this section, we show you how to configure an Amazon A2I custom task type with Amazon Fraud Detector using the accompanying Jupyter notebook. For instructions, see the following. Conclusion: this post demonstrated how you can detect online fraud using Amazon Fraud Detector and set up human review workflows using the Amazon A2I custom task type to review and validate high-risk predictions.
How often you bang your head depends on how close you are to the north pole
       
Photo by Jennifer Latuperisa-Andresen on Unsplash. How often you bang your head depends on how close you are to the north pole: this is an ancient technique. Truth be told, I was never an ardent follower of metal, but the hard rock subgenre fit right into my alley. The USA, UK, and Germany have large populations, so maybe we need to find the number of metal bands per capita. To those who make our heads bang: what are the most popular metal bands in each country? Greenland, Iceland, Sweden, Norway, and Finland seem to be rocking pretty hard; the closer to the north pole, the harder the head banging!
6 Approaches to Speed Up Your Python Code
       
Maybe not as fast as C, C++, or Java, but fast enough to make your Python code better. Keeping your code up to date by using the latest Python release ensures that you benefit from the new optimization techniques implemented in each release. Approach №6: use speedup applications. Various applications can be used to speed up your Python code; Numba, for example, turns Python code into faster machine code for faster execution. In this article, we went through 6 ways you can use to make your Python code faster.
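A minimal Numba sketch (the toy loop is mine): the @njit decorator compiles the function to machine code on first call, so the pure-Python loop runs orders of magnitude faster on subsequent calls.

from numba import njit

@njit
def total(n):
    s = 0
    for i in range(n):
        s += i * i
    return s

print(total(10_000_000))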
Fake Job Predictor
       
Fake Job Predictor: I spent some time during this lockdown working on a project, a "Fake Job Predictor." When I was looking for a new job, making a predictor like this seemed a good idea. The final model takes in any relevant job posting data and produces a result determining whether the job is real or not. Fake and real job counts by state: the graph shows that Texas and California have a higher share of fake jobs than other states. Fake-to-real job ratio: in California, Bakersfield has a fake-to-real job ratio of 15:1, and Dallas, Texas has a ratio of 12:1. Even though the character count is relatively similar for both real and fake jobs, real jobs have a higher frequency.
What On Earth Is The Curse Of Dimensionality?
       
Introduction: an unfortunate downside to a Data Science career is that it can be rather difficult to understand data at times. Furthermore, a big problem for newcomers to the field is getting a solid understanding of high-dimensional data in the first place. High-dimensional data can be problematic not only for statistics and computers, but also for the humans working with them. Fortunately, if you would like to learn more about random projection, I have an article you can check out here. Conclusion: the curse of dimensionality comes to fruition surprisingly frequently in the world of Data Science. The curse of dimensionality is only a curse given to exciting data, but knowing how to work with that data can make it exciting in a whole new way.
Question And Answering With Bert
       
Finding a model: screenshot of the HuggingFace models page, where we select Question Answering to filter for models trained specifically for Q&A. Let's first find a model to use. If you're interested in swapping out Bert for Electra, just switch deepset/bert-base-cased-squad2 with deepset/electra-base-squad2 throughout this article. Our Bert tokenizer is in charge of converting human-readable text into Bert-friendly data, called token IDs. At this point, Bert does some linguistic wizardry to identify the location of the answer within the context provided. Once Bert has decided, it returns the span of the answer, meaning the start token index and end token index.
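A compact sketch using the transformers pipeline with the model named in the article (the question and context strings are mine):

from transformers import pipeline

qa = pipeline("question-answering", model="deepset/bert-base-cased-squad2")
result = qa(question="What does the tokenizer produce?",
            context="The Bert tokenizer converts human-readable text into token IDs.")
print(result["answer"], result["start"], result["end"])  # answer span indices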
Practical Example of Dimensionality Reduction
       
In this post, I am going to go through four techniques for dimensionality reduction using Python. Before we dive into the practical side, let’s make sure you understand what dimensionality reduction is and why it needs to be done. What is Dimensionality Reduction? With an understanding of dimensionality reduction and why it’s necessary, let’s get started! Recursive Feature Elimination (RFE)The next dimensionality reduction technique is a manual recursive feature elimination method based on the random forest classifier model’s feature importance.
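A sketch of the recursive feature elimination technique mentioned above, on the bundled wine dataset (the dataset and feature count are my choices for brevity):

from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X, y = load_wine(return_X_y=True)
# recursively drop the weakest features according to the random
# forest's feature importances, keeping the top 5
rfe = RFE(RandomForestClassifier(random_state=0), n_features_to_select=5)
rfe.fit(X, y)
print(rfe.support_)   # boolean mask of the retained features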
How to Master The Subtle Art of Train/Test Set Generation
       
Suppose we want to predict the price of a diamond using its attributes, like carat and the length of its dimensions. As sample data, we will load the diamonds dataset from Seaborn with diamonds = sns.load_dataset('diamonds'). The dataset has several numerical and three categorical features. For all ML algorithms, there is a single, clearly defined process: divide the data into training and testing sets, fit the model on the training set, and test accuracy on the test set. The splitting task can be done using Scikit-learn's train_test_split function. The size of the test set is specified with the test_size argument; here we are setting aside 30% of the data for testing, as in the sketch below.
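A minimal version of that split on the same dataset:

import seaborn as sns
from sklearn.model_selection import train_test_split

diamonds = sns.load_dataset("diamonds")
X = diamonds.drop(columns="price")
y = diamonds["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)   # 30% held out for testing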
How to Consistently Write in Data Science
       
Something in me just wanted to continue learning about Data Science and contribute to it in some way, so I did just that. Data Science is a very broad field, and when you find a niche that you enjoy writing about, you find yourself enjoying the process much more. Think about it like this, what are you most interested in when it comes to Data Science? Write consistently for a month and see how much easier it becomes to write more. By doing this, you hold yourself accountable to continue writing, and therefore you will continue to write consistently.
Creating a Smart Chat Bot that Talks Like You
       
Creating a Smart Chat Bot that Talks Like You (photo by Andrea Piacquadio from Pexels). Introduction: in this article, an approach to creating a chat bot with pre-trained word embeddings and a recurrent neural network with an encoder-decoder architecture is described. As training such word embeddings requires great effort and huge amounts of data, pre-trained word embeddings are used. Since word embeddings for a particular language are needed, the availability of pre-trained word embeddings for the desired language should be determined first. Thanks to the already-learned word embeddings, training does not take long and you do not need that much training data. We also cover how training data can be obtained and prepared to make a chat bot that talks like you.
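One common way to load pre-trained embeddings, sketched with gensim's downloader (the embedding set named here is one example; the article does not specify which set it uses):

import gensim.downloader as api

# downloads the vectors on first use (~130MB for this set)
vectors = api.load("glove-wiki-gigaword-100")
print(vectors.most_similar("chat", topn=3))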
How to debug your Deep Learning pipeline with TensorBoard
       
How to debug your Deep Learning pipeline with TensorBoard. UMAP segmentation of 10 Hiragana character classes using TensorBoard (image by author). As we develop our Deep Learning models, we often stumble upon various kinds of problems within our pipeline. TensorBoard is an advanced profiling tool made specifically for Deep Learning pipelines that works with all kinds of data sources (image, video, audio, structured data, etc.). First we load the validation data and extract the predicted classes and probabilities for each batch. TensorBoard hparams: tune your model interactively (image by author). In the Table view, we can observe the performance metrics for each hyperparameter configuration. Final thoughts: TensorBoard is generally a great profiling tool to help debug your deep learning pipeline, both at the data and the model level.
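The quickest way in, sketched with a toy Keras model (data and model are placeholders): attach the TensorBoard callback, then inspect the logs with tensorboard --logdir logs.

import numpy as np
import tensorflow as tf

x = np.random.rand(100, 8).astype("float32")
y = np.random.randint(0, 2, 100)
model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
model.compile("adam", "binary_crossentropy")
tb = tf.keras.callbacks.TensorBoard(log_dir="logs", histogram_freq=1)
model.fit(x, y, epochs=2, callbacks=[tb])   # writes scalars and histograms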
An Inferential Perspective on Federated Learning
       
TL;DR: motivated to better understand the fundamental tradeoffs in federated learning, we present a probabilistic perspective that generalizes and improves upon federated optimization and enables a new class of efficient federated learning algorithms. This is where federated learning comes to the rescue! Broadly, federated learning (FL) allows multiple data owners (or clients ) to train shared models collaboratively under the orchestration of a central server without having to share any data. Each drawing of the plot corresponds to a run of federated optimization from a different starting point in the parameter space. We believe that FedPA is just the beginning of a new class of approaches to federated learning.
Simulated Annealing From Scratch in Python
       
How to implement the simulated annealing algorithm from scratch in Python. Tutorial overview: this tutorial is divided into three parts: simulated annealing, implementing simulated annealing, and a simulated annealing worked example. Simulated annealing is a stochastic global search optimization algorithm. If the new point is better than the current point, the current point is replaced with the new point. Now that we are familiar with the simulated annealing algorithm, let's look at how to implement it from scratch, as sketched below.
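A compact from-scratch sketch in the spirit of the tutorial (the objective, bounds, and constants are illustrative): worse candidates are accepted with a probability that shrinks as the temperature decays, which lets the search escape local minima.

import numpy as np

def simulated_annealing(objective, bounds, n_iter=1000, step=0.1, temp=10.0):
    best = bounds[:, 0] + np.random.rand(len(bounds)) * (bounds[:, 1] - bounds[:, 0])
    best_eval = objective(best)
    curr, curr_eval = best, best_eval
    for i in range(n_iter):
        cand = curr + np.random.randn(len(bounds)) * step
        cand_eval = objective(cand)
        if cand_eval < best_eval:
            best, best_eval = cand, cand_eval
        t = temp / float(i + 1)   # temperature schedule
        if cand_eval < curr_eval or np.random.rand() < np.exp(-(cand_eval - curr_eval) / t):
            curr, curr_eval = cand, cand_eval
    return best, best_eval

print(simulated_annealing(lambda x: x[0] ** 2, np.array([[-5.0, 5.0]])))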
Google AI Blog: Mastering Atari with Discrete World Models
       
In contrast, recent advances in deep RL have enabled model-based approaches to learn accurate world models from image inputs and use them for planning. World models can learn from fewer interactions, facilitate generalization from offline data, enable forward-looking exploration, and allow reusing knowledge across multiple tasks. We introduce a new metric that normalizes scores by the world record and clips them to not exceed the record. ConclusionWe show how to learn a powerful world model to achieve human-level performance on the competitive Atari benchmark and outperform the top model-free agents. We see world models that leverage large offline datasets, long-term memory, hierarchical planning, and directed exploration as exciting avenues for future research.
Machine learning on distributed Dask using Amazon SageMaker and AWS Fargate
       
Alternatively, there are cloud options such as Amazon SageMaker, Amazon EMR, and Amazon Elastic Kubernetes Service (Amazon EKS) clusters. We use a SageMaker notebook with the backend integrated with a scalable distributed Dask cluster running on Amazon ECS on Fargate. Implementing distributed Dask on Fargate using AWS CloudFormation: to provision your resources with AWS CloudFormation, log in to your AWS account and choose your Region. Implementing distributed Dask on Fargate using the AWS CLI: to implement distributed Dask using the AWS Command Line Interface (AWS CLI), first install the AWS CLI. Dask ML algorithms integrate with joblib to submit jobs to the distributed Dask cluster, as in the sketch below.
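A minimal sketch of the joblib integration; Client() here starts a local cluster, while in the Fargate setup you would pass the remote scheduler address instead:

import joblib
from dask.distributed import Client
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

client = Client()   # e.g. Client("tcp://<scheduler>:8786") for a remote cluster
X, y = make_classification(n_samples=1000, random_state=0)
model = RandomForestClassifier(n_jobs=-1)
with joblib.parallel_backend("dask"):   # route scikit-learn's parallel work to Dask
    model.fit(X, y)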
Random Projection And Its Role In Data Science
       
The Johnson-Lindenstrauss lemma is a result pioneered by mathematicians William B. Johnson and Joram Lindenstrauss. The lemma states that high-dimensional data can be embedded into a lower-dimensional space with little to no distortion. Needless to say, this technique of performing decomposition on high-dimensional data is incredibly useful. Here is how to import it: from sklearn.random_projection import johnson_lindenstrauss_min_dim. Random projection techniques: while the lemma that forms the basis for random projection is pretty awesome, even more awesome are the techniques that we can actually perform on data. This method can be incredibly useful because it is a relatively basic way to implement the Johnson-Lindenstrauss lemma that is still very effective.
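A short sketch tying the two pieces together (the shapes are illustrative): the lemma bounds how few dimensions preserve pairwise distances, and GaussianRandomProjection performs the actual embedding.

import numpy as np
from sklearn.random_projection import GaussianRandomProjection, johnson_lindenstrauss_min_dim

# minimum dimensions to preserve pairwise distances within 10%
print(johnson_lindenstrauss_min_dim(n_samples=10000, eps=0.1))

X = np.random.rand(100, 10000)
X_small = GaussianRandomProjection(n_components=500).fit_transform(X)
print(X_small.shape)   # (100, 500)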
Kernel Regression in Python
       
Kernel Regression in Python. Table of contents: 1. Kernel regression with Statsmodels; 2. Kernel regression by hand in Python; 3. Conclusion; 4. References. This notebook demonstrates how you can perform kernel regression manually in Python. While Statsmodels provides a library for kernel regression, doing kernel regression by hand can help us better understand how we get to the final result. To begin with, let's look at kernel regression with Statsmodels. We generate y values by using a lambda function. The output of kernel regression in the Statsmodels nonparametric regression module is two arrays.

fig = px.scatter(x=new_x, y=fun_y(new_x), title='Figure 2: Statsmodels fit to generated data')
fig.add_trace(go.Scatter(x=new_x, y=pred_y, name='Statsmodels fit', mode='lines'))

To do kernel regression by hand, we need to understand a few things.
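A minimal Statsmodels sketch with generated data (the sine target is my placeholder): fit() returns exactly the two arrays mentioned above, the estimated conditional mean and the marginal effects.

import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

x = np.linspace(0, 10, 100)
y = np.sin(x) + 0.2 * np.random.randn(100)
kr = KernelReg(endog=y, exog=x, var_type="c")   # 'c' = continuous regressor
y_pred, marginal = kr.fit(x)
print(y_pred[:5])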
7 Points to Create Better Scatter Plots with Seaborn
       
7 Points to Create Better Scatter Plots with Seaborn (photo by Tom Thain on Unsplash). Data visualization is of crucial importance in data science. In this article, we will go over 7 points to customize a scatter plot in the Seaborn library. Scatter plots are mainly used to visualize the relationship between two continuous variables. We will be creating several scatter plots using the Melbourne housing dataset available on Kaggle, and then go over some tips to make the scatter plots more informative and appealing.
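The basic call that the 7 tips build on, sketched with a bundled dataset since the Melbourne data lives on Kaggle:

import seaborn as sns
import matplotlib.pyplot as plt

df = sns.load_dataset("tips")   # stand-in for the Melbourne housing data
sns.scatterplot(data=df, x="total_bill", y="tip", hue="time", size="size")
plt.show()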
Visualizing the MLP: A Composition of Transformations
       
Basis vectors are independent unit vectors in a coordinate space. Change of basis: to better understand how basis vectors define a coordinate space, let's see what happens when we change the basis vectors. Linear layers (image by author): now that we are armed with some linear algebra fundamentals, let's look at a basic linear layer. Intuition behind the MLP: now that we've covered each component of a linear layer, how do multiple layers behave together? After the line is drawn, the network is rolled backward through inverted operations to the original coordinate space.
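A tiny numeric sketch of the core idea (values are mine): a linear layer is a change of basis, the matrix W, plus a shift, the bias b.

import numpy as np

W = np.array([[2.0, 0.0], [0.0, 0.5]])   # new basis vectors as columns
b = np.array([1.0, -1.0])                # the bias shifts the origin
x = np.array([1.0, 1.0])
print(W @ x + b)                         # the point in the new coordinates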
When logistic regression simply doesn’t work
       
When logistic regression simply doesn't work (photo by Tim Gouw on Unsplash). Logistic regression is a very commonly used method for predicting a target label from structured tabular data. Although there are some more advanced methods that generally perform better, the simplicity of logistic regression makes it the first choice. In this post, I would like to show some of the limitations by presenting a very simple use case in which logistic regression doesn't work well. In such cases, logistic regression (or linear regression for regression problems) can't predict targets with good accuracy, even on the training data. An interesting thing is that with a little bit more feature engineering we can make logistic regression work here.
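A minimal sketch of that failure and fix on XOR-like data (my own toy construction, not the post's dataset): no single line separates the classes, but one engineered interaction feature makes them linearly separable.

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.random.randn(1000, 2)
y = (X[:, 0] * X[:, 1] > 0).astype(int)
print(LogisticRegression().fit(X, y).score(X, y))    # near chance level

X2 = np.hstack([X, (X[:, 0] * X[:, 1]).reshape(-1, 1)])  # add the interaction
print(LogisticRegression().fit(X2, y).score(X2, y))  # near 1.0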
CatBoost regression in 6 minutes
       
Application: the objective of this tutorial is to provide hands-on experience with CatBoost regression in Python. In this simple exercise, we will use the Boston Housing dataset to predict Boston house prices. But in this context, the main emphasis is on introducing the CatBoost algorithm. With model = cb.CatBoostRegressor(loss_function='RMSE'), we use the RMSE measure as our loss function because this is a regression task. If you want to discover more hyperparameter tuning possibilities, check out the CatBoost documentation here.
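A runnable sketch of the same pattern; note I substitute the California housing data because load_boston has been removed from recent scikit-learn releases:

import catboost as cb
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = cb.CatBoostRegressor(loss_function="RMSE", verbose=0)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))   # R^2 on the held-out set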
How To Train Your Siamese Neural Network
       
In this article I will discuss a type of model known as a Siamese Neural Network. In short, a Siamese Neural Network is any model architecture which contains at least two parallel, identical Convolutional Neural Networks. An example embedding space with two classes, crosses and squares, and a yet-undetermined class embedding represented by a triangle. Once the models are compiled, we store a subset of the test image embeddings. Conclusion: in this article, we have learned what a Siamese Neural Network is, how to train one, and how to utilise it at inference time.
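A structural sketch of the twin-tower idea in Keras (my own minimal version, not the article's model): one encoder is reused for both inputs so the branches share weights, and the output is the distance between the two embeddings.

import tensorflow as tf
from tensorflow.keras import layers, Model

def make_encoder():
    inp = layers.Input((28, 28, 1))
    x = layers.Conv2D(32, 3, activation="relu")(inp)
    x = layers.GlobalAveragePooling2D()(x)
    return Model(inp, layers.Dense(64)(x))

encoder = make_encoder()   # both branches share these weights
a = layers.Input((28, 28, 1))
b = layers.Input((28, 28, 1))
dist = layers.Lambda(lambda t: tf.norm(t[0] - t[1], axis=1, keepdims=True))(
    [encoder(a), encoder(b)])
siamese = Model([a, b], dist)   # small distance = likely same class
siamese.summary()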
Customizing NetworkX Graphs
       
Basically, we create another DataFrame where we specify the node ID and node type, and use the pd.Categorical() method to apply a colormap. So now our letter nodes are colored blue and our number nodes are colored orange! Node size: altering node size globally is, again, quite simple via a keyword argument in the .draw() method; just specify node_size! Node size by node type: we can alter node size by type just like we can for color. We build from our node-color-by-type example, but instead of a single keyword argument for node_size, we pass in a list of node sizes referencing the node type used to choose node color.
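A compact sketch of both customizations together (the toy graph is mine): the color and size lists are aligned with G.nodes, keyed off each node's type.

import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph([("a", 1), ("b", 2), ("a", 2)])   # letter and number nodes
colors = ["skyblue" if isinstance(n, str) else "orange" for n in G.nodes]
sizes = [300 if isinstance(n, str) else 600 for n in G.nodes]
nx.draw(G, node_color=colors, node_size=sizes, with_labels=True)
plt.show()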
Audio Deep Learning Made Simple (Part 2): Why Mel Spectrograms perform better
       
Why Mel spectrograms perform better (this article); processing audio data in Python. Audio file formats and Python libraries: audio data for your deep learning models will usually start out as digital audio files. The Mel scale was developed to take this into account by conducting experiments with a large number of listeners. Mel spectrograms: a Mel spectrogram makes two important changes relative to a regular spectrogram that plots frequency vs. time. Conclusion: we have now seen how we pre-process audio data and prepare Mel spectrograms.
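One common way to compute a Mel spectrogram, sketched with librosa and its bundled example clip (downloaded on first use):

import librosa
import numpy as np

y, sr = librosa.load(librosa.ex("trumpet"))            # bundled example audio
S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
S_db = librosa.power_to_db(S, ref=np.max)              # log-scaled amplitude
print(S_db.shape)                                      # (mel bands, time frames)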
So…How exactly is AI being used to detect COVID-19?
       
Next, let us move our focus to how exactly we can use deep learning to diagnose COVID-19 from a chest X-ray scan. Thus, we tweak the original architecture of the neural network into a convolutional neural network (CNN), which preserves the spatial structure of the input. A 2x2 kernel (pink) sliding over a 4x4 input image (light green) produces a 3x3 output image (dark green). References: Deep learning; Generalizability of Deep Learning Tuberculosis Classifier to COVID-19 Chest Radiographs: New Tricks for an Old Algorithm?
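A one-liner verifying the 4x4 input / 2x2 kernel arithmetic described above, since (4 - 2)/1 + 1 = 3:

import torch
import torch.nn as nn

x = torch.rand(1, 1, 4, 4)              # 4x4 single-channel image
conv = nn.Conv2d(1, 1, kernel_size=2)   # 2x2 kernel, stride 1, no padding
print(conv(x).shape)                    # torch.Size([1, 1, 3, 3])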
The Deep Learning Inference Acceleration Blog Series — Part 2- Hardware
       
It is no wonder that the market for deep learning accelerators is on fire. In this post, we continue our deep learning inference acceleration series and dive into hardware acceleration, the first level in the inference acceleration stack (see Figure 1). We review several hardware platforms that are being used today for deep learning inference, describe each of them, and highlight their pros and cons. The sole purpose of these custom AI chips is to perform the deep learning operations faster than existing solutions (i.e., GPUs). ConclusionIn this second post of the deep learning acceleration series, we surveyed the different hardware solutions used to accelerate deep learning inference.
Dynamic Pricing using Reinforcement Learning and Neural Networks
       
Data preparation and system architecture: the dynamic pricing system architecture consists of three fundamental parts. It has two main uses: applying the reinforcement learning algorithm and providing access to data. It processes the data using a trained PyTorch model, saving the results in the database. Results: in order to compare the results, both the original e-commerce pricing policy and the trained agent's pricing policy were run on the simulator environment. Analyzing the financial results, the reinforcement learning agent outperformed the baseline pricing policy by 3.48%.
2021 Habitat Challenge launches to advance embodied AI research
       
Facebook AI is excited to launch the third Habitat Challenge, an open research initiative that invites AI experts around the world to teach machines to navigate real-world environments. Habitat Challenge launches at the 2021 Embodied AI Workshop at the Conference on Computer Vision and Pattern Recognition (CVPR), in coordination with eight other embodied AI challenges supported by 15 academic and research organizations. Three of these research competitions will also be based in Habitat-Sim, supported by Facebook AI researchers and our close collaborators. One of the central AI research challenges today is teaching machines to move through and operate intelligently in complex situations in the physical world. Partners and embodied AI challenges at CVPR 2021:
Solving numerical optimization problems like scheduling, routing, and allocation with Amazon SageMaker Processing
       
In this post, we discuss solving numerical optimization problems using the very flexible Amazon SageMaker Processing API. This pattern is relevant to solving business-critical problems such as scheduling, routing, allocation, shape optimization, and trajectory optimization. From the scheduling example: # shifts[(n, d, s)]: nurse 'n' works shift 's' on day 'd'. Summary: we used various examples, front ends, and solvers to solve numerical optimization problems using SageMaker Processing. If you currently use SageMaker APIs for your machine learning projects, using SageMaker Processing to run optimization is a simple, obvious extension.
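The shifts[(n, d, s)] comment above follows the style of a constraint-programming nurse schedule; here is a minimal CP-SAT sketch of that formulation (sizes and names are illustrative, not the post's code):

from ortools.sat.python import cp_model

model = cp_model.CpModel()
num_nurses, num_days, num_shifts = 3, 2, 2
shifts = {(n, d, s): model.NewBoolVar(f"shift_n{n}_d{d}_s{s}")
          for n in range(num_nurses)
          for d in range(num_days)
          for s in range(num_shifts)}
# every shift on every day is covered by exactly one nurse
for d in range(num_days):
    for s in range(num_shifts):
        model.Add(sum(shifts[(n, d, s)] for n in range(num_nurses)) == 1)
solver = cp_model.CpSolver()
print(solver.Solve(model) == cp_model.OPTIMAL)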
Detecting Pneumonia in Chest X-Ray Images under Orange Machine Learning/Deep Learning Platform
       
This post illustrates the process of developing and comparing different models for binary classification on normal chest X-ray images and pneumonia chest X-ray images in Orange. Chest X-Ray Image DatasetThe dataset is organized into 3 folders (train/val/test) and contains subfolders for each image category (normal/pneumonia jpeg). Read pneumonia chest X-ray Images and extract its features — [Figure 2/left-bottom] “Read Pneumonia” uses “Import Images Widget” to read the pneumonia chest X-ray images from “train/pneumonia”; “Image Embedding Widget” converts the images into its features (the mechanism will be discussed later); “Image Viewer Widget” and “Data Table Widget” are used to examine the images and the features, respectively. Combine the feature lists of normal chest x-ray images and pneumonia chest x-ray images — [Figure 2/center] “Concatenate Widget” combines the feature lists and assigns the sources of images as labels. “Image Viewer Widget” shows some chest x-ray images.
Why Data Scientists Should Become Data Analysts First
       
"The second-fastest position growth within data science roles went to business and data analysts, which increased by 20%." Data science roles are not growing as fast as data analyst positions. If you lack data science experience, consider applying for data analyst jobs because it's less competitive. The technical requirements for a data analyst are not as high as for a data scientist, and a person with data science training and a graduate degree will likely be chosen over a candidate without these credentials. Data science adoption: companies may not be ready to adopt data science, but the demand for data analysts has increased due to an increase in data generation and ease of data access. There is also potential to become a full-time data scientist in the future, when there are enough problems available to be solved with data science.
4 Reasons Why Your Machine Learning Model Is Underperforming
       
Generally, the larger the data, the more likely it is to be representative of the target population to which you want to generalize. It is crucial to have a good understanding of the distribution of your target population in order to devise the right data collection techniques. Once you have the data, study the data (the exploratory data analysis phase) in order to determine its distribution and representativeness. Overfitting is when the model fits the training data too closely and cannot generalize to the target population. The more complex a model is, generally, the better it is at detecting the subtle patterns in the training dataset.
How my mother started using CNN for her plants
       
Photo by veeterzy on Unsplash. How my mother started using a CNN for her plants: the lockdown brought out many skills in people; a few turned towards cooking, a few to musical instruments, and a few tried to grow plants. Just when I was thinking of moving to more numerical analysis based on the characteristics of the soil, I found a soil dataset. I use the split-folders package to divide the data (containing subfolders) into train, test, and validation sets, as in the sketch below. After defining the train, test, and validation data, I formed a simple CNN model with a series of convolutional and max-pooling layers with dropouts in between. Now I'll evaluate the model on the test data and plot a confusion matrix to check the misclassifications.
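A minimal split-folders sketch (the input folder name and ratios are placeholders; the package expects one subfolder per class):

import splitfolders

# split an image directory into train/val/test at an 80/10/10 ratio
splitfolders.ratio("soil_images", output="data", seed=42, ratio=(.8, .1, .1))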
Outlier detection methods in Machine Learning
       
This article discusses a few commonly used methods to detect outliers while preprocessing data for machine learning models. In the box plots (images by author), outliers appear as points below and above the whiskers; 'alcohol', 'total_phenols', 'od280/od315_of_diluted_wines', and 'proline' have no outliers. Let's look at the implementation of the Z-score method in Python, sketched below. These are a few commonly used outlier detection methods in machine learning.
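A minimal Z-score sketch with one injected outlier (the data and threshold are illustrative): points more than 3 standard deviations from the mean are flagged.

import numpy as np

def zscore_outliers(x, threshold=3.0):
    z = (x - x.mean()) / x.std()
    return np.where(np.abs(z) > threshold)[0]   # indices of outliers

data = np.append(np.random.randn(100), [8.5])   # one injected outlier
print(zscore_outliers(data))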
Can Your AI Have Emotions?
       
While deep learning and artificial neural networks have evolved through the years, they are only loosely inspired by the biology of the human brain. These scenarios lead to several intriguing questions that arise and dwell deep within our minds. Some seemingly familiar but nonetheless peculiar queries we might want answered include: can AI have emotions? Will this AI create a potential threat to human existence? You might have wondered about these mind-boggling and intriguing questions at some point.
Measuring Distance Using Convolutional Neural Network
       
Measuring Distance Using a Convolutional Neural Network: in signal processing, it is sometimes necessary to measure the horizontal distance between features of a signal, for example its peaks. Our goal, however, is to train a neural network to predict the distance between the peaks. Baseline performance: 0.9999812121197582. Using a CNN to measure distances: when designing a neural network, it is often useful to imagine what a human operator would do. Conclusion: when designing a neural network, it is often useful to imagine how human perception and cognition work. In this toy problem, we presented another example where human intuition guides neural network construction.
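My own reconstruction of such a toy setup (the signal generation, architecture, and sizes are all assumptions, not the article's code): two unit peaks are placed in a 1D signal, and a small 1D CNN regresses their distance.

import numpy as np
import tensorflow as tf

def make_sample(n=128):
    x = np.zeros(n)
    i, j = np.sort(np.random.choice(n, 2, replace=False))
    x[i] = x[j] = 1.0                # two peaks
    return x, float(j - i)           # target: horizontal distance

X, y = zip(*(make_sample() for _ in range(2000)))
X = np.array(X)[..., None]; y = np.array(y)
model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, 5, activation="relu", input_shape=(128, 1)),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(1),
])
model.compile("adam", "mse")
model.fit(X, y, epochs=3, verbose=0)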
Photo Finish: Building an AI at Home
       
Spotlight: Photo Finish: Building an AI at Home. A small part of my family's slide collection. The files in each subfolder are then fed to the neural network training algorithm, which attempts to extract the characteristics the images have in common. Imagine you wanted to train a neural network to recognise Christmas trees. This technique is so effective that the FastAi library will add image rotation automatically to your datablock configuration through an option called item transforms. This option is used instead of the more commonly used item_tfms=aug_transforms(224, min_scale=0.5), which would also add image rotation augmentation.
No Free Lunch Theorem for Machine Learning
       
Tutorial overview: this tutorial is divided into three parts: what the No Free Lunch theorem is, its implications for optimization, and its implications for machine learning. "… known as the 'no free lunch' theorem, sets a limit on how good a learner can be." Now that we have reviewed the implications of the no free lunch theorem for optimization, let's review the implications for machine learning. The no free lunch theorem for optimization and search applies to machine learning, specifically supervised learning, which underlies classification and regression predictive modeling tasks. Specifically, you learned that the no free lunch theorem suggests the performance of all optimization algorithms is identical, under some specific constraints.
Google AI Blog: Rearranging the Visual World
       
Today, we present the Transporter Network, a simple model architecture for learning vision-based rearrangement tasks, which appeared as a publication and plenary talk at CoRL 2020. We are also releasing an accompanying open-source implementation of Transporter Nets together with Ravens, our new simulated benchmark suite of ten vision-based manipulation tasks. Transporter Nets leverage this structure by capturing a deep representation of the 3D visual world, then overlaying parts of it on itself to imagine various possible rearrangements of 3D space. Highlights: given 10 example demonstrations, Transporter Nets can learn pick-and-place tasks such as stacking plates (surprisingly easy to misplace!). Conclusion: Transporter Nets bring a promising approach to learning vision-based manipulation, but are not without limitations.
Data processing options for AI/ML
       
For data processing and data preparation, you can use either Amazon SageMaker Data Wrangler or Amazon SageMaker Processing for the processing itself, and either Amazon SageMaker Studio or SageMaker notebook instances for data visualization. For a "lift and shift" of an existing workload to SageMaker, SageMaker Processing may be a good fit. The following table compares SageMaker Processing and SageMaker Data Wrangler across some key dimensions. With many built-in transformations and built-in visualizations, DataBrew covers both data processing and data visualization. Spark in Amazon EMR: many organizations use Spark for data processing and other purposes, such as the basis for a data warehouse.
Translating JSON documents using Amazon Translate
       
This post shows you a serverless approach for easily translating JSON documents using Amazon Translate. In this post, we walk you through creating an automated and serverless pipeline for translating JSON documents using Amazon Translate. The function converts the JSON documents into XML, stores them in Amazon S3, and invokes Amazon Translate in batch mode to translate the XML documents texts into the target language. A Lambda function reads the translated XML documents in Amazon S3, converts them to JSON documents, and stores them back in Amazon S3. ConclusionIn this post, we demonstrated how to translate JSON documents using Amazon Translate asynchronous batch processing.
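The post's pipeline uses Amazon Translate's batch mode on converted XML; for a feel of the service, here is the simpler real-time call with boto3 (the strings and languages are placeholders):

import boto3

translate = boto3.client("translate")
resp = translate.translate_text(
    Text="Hello, world",
    SourceLanguageCode="en",
    TargetLanguageCode="es",
)
print(resp["TranslatedText"])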
Building an omnichannel Q&A chatbot with Amazon Connect, Amazon Lex, Amazon Kendra, and the open-source QnABot project
       
Amazon Connect is a cloud contact center that provides a seamless experience across voice and chat for customers and agents. They used the AWS open-source Amazon Lex Web UI project, a sample Amazon Lex Web UI that helps provide a full-featured web client for Amazon Lex chatbots. Shortly after expanding the QnABot to their website, OSU-OKC realized that providing more channels for students to interact with didn’t dilute engagement levels. The section Turbocharging QnABot with Amazon Kendra outlines how to integrate QnABot with Amazon Kendra. ConclusionThe team at OSU-OKC is excited to build on the early success they have seen from deploying the QnABot, Amazon Kendra, and Amazon Lex.
Learning Curve to identify Overfitting and Underfitting in Machine Learning
       
This article discusses overfitting and underfitting in machine learning, along with the use of learning curves to identify them effectively. Typical features of the learning curve of a good-fit model: training loss and validation loss are close to each other, with validation loss slightly greater than the training loss; both initially decrease and then stay fairly flat until the end. Typical features of the learning curve of an overfit model: training loss and validation loss are far away from each other. Typical features of the learning curve of an underfit model: training loss increases as training examples are added. A sketch for computing such curves follows.
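A minimal scikit-learn sketch (model and data are placeholders): a large persistent gap between the train and validation curves suggests overfitting, while two poor, flat curves suggest underfitting.

import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, random_state=0)
sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y, cv=5,
    train_sizes=np.linspace(0.1, 1.0, 5))
print(train_scores.mean(axis=1), val_scores.mean(axis=1))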
Bootstrap Resampling
       
The trick to bootstrap resampling is sampling with replacement. Bootstrap benefits: bootstrap and resampling are widely applicable statistical methods which relax many of the assumptions of classical statistics. Bootstrap overview: we compute the bootstrap mean as the average of the means of the bootstrap samples. Bootstrap example: to demonstrate the power of the bootstrap, I analyze the means of the heights of different populations from Galton's height dataset in a Jupyter notebook; the dataset is famous for giving us the phrase "regression to the mean" with respect to children's heights. We take many (n_replicas) bootstrap samples and plot the distribution of sample means as well as the mean of the sample means. The bootstrap distribution of the difference in means does conform to the CLT, which allows us to trust the statistics we derived from bootstrap resampling the original dataset.
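A minimal numpy sketch of the procedure (the synthetic heights stand in for Galton's data): resample with replacement many times, then summarize the distribution of the resampled means.

import numpy as np

rng = np.random.default_rng(0)
heights = rng.normal(170, 10, size=200)       # stand-in for Galton's data
# sampling WITH replacement is the key trick of the bootstrap
boot_means = [rng.choice(heights, size=len(heights), replace=True).mean()
              for _ in range(5000)]
print(np.mean(boot_means), np.percentile(boot_means, [2.5, 97.5]))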
Developing a streamlit-webrtc component for real-time video processing
       
Developing a streamlit-webrtc component for real-time video processingReal-time video processing is one of the most important applications when developing various computer vision or machine learning models. But it also presents a challenge to those of us using Streamlit, since Streamlit doesn’t natively support real-time video processing well yet through its own capabilities. I created streamlit-webrtc, a component that enables Streamlit to handle real-time media streams over a network to solve this problem. One existing approach to achieve real-time video processing with Streamlit is to use OpenCV to capture video streams. The execution model of streamlit-webrtcWe have followed the steps to develop a minimal Streamlit component utilizing WebRTC to stream video.
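The minimal entry point from the component's README, for reference (save as app.py and run with streamlit run app.py):

import streamlit as st
from streamlit_webrtc import webrtc_streamer

st.title("Live video demo")
webrtc_streamer(key="example")   # streams the user's webcam over WebRTC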
Can Machine Learning Ever Be “Fair” — and What Does That Even Mean?
       
By using machine learning, the company says, they'll be able to reduce bias in the hiring manager's decision-making process. So the company makes a simple fix: they simply won't include dog breed as a feature in the machine learning model. However, the problem of redundant encodings emerges: features in the dataset that account for the protected attribute. Machine learning models need to be fair, but they also need to perform well. A variant of equalized odds, equal opportunity, is now a standard for measuring fairness in machine learning models.
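To make the metric concrete, here is my own minimal sketch of an equal opportunity gap (not from the article): the metric compares true positive rates across groups, so a gap near zero is fairer.

import numpy as np

def equal_opportunity_gap(y_true, y_pred, group):
    tpr = {}
    for g in np.unique(group):
        mask = (group == g) & (y_true == 1)
        tpr[g] = y_pred[mask].mean()   # true positive rate within group g
    vals = list(tpr.values())
    return max(vals) - min(vals)

y_true = np.array([1, 1, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 0, 1])
group = np.array(["a", "a", "b", "b", "a", "b"])
print(equal_opportunity_gap(y_true, y_pred, group))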
A/B Testing: A Complete Guide to Statistical Testing
       
A/B Testing: A Complete Guide to Statistical Testing. Photo by John McArthur on Unsplash. What is A/B testing? In this article we'll see how different statistical methods can be used to make A/B testing successful. P-value (image by author). Now, some care has to be applied to properly choose the alternative hypothesis Ha. For example, n_X = 17 users saw layout A and then made the following purchases: 200$, 150$, 250$, 350$, 150$, 150$, 350$, 250$, 150$, 250$, 150$, 150$, 200$, 0$, 0$, 100$, 50$.
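To compare such purchase amounts between two layouts, a two-sample t-test is a common choice; a minimal sketch with SciPy (layout B's numbers are invented purely for illustration):

from scipy import stats

layout_a = [200, 150, 250, 350, 150, 150, 350, 250, 150,
            250, 150, 150, 200, 0, 0, 100, 50]
layout_b = [250, 200, 300, 350, 200, 150, 400, 300, 200,
            250, 150, 200, 250, 50, 0, 150, 100]

# Welch's t-test does not assume equal variances between the groups.
t_stat, p_value = stats.ttest_ind(layout_b, layout_a,
                                  equal_var=False, alternative="greater")
print(t_stat, p_value)   # reject H0 if p_value < chosen significance level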
AI Regulation: No Longer a Distant Future
       
Regulation of AI will be fast-tracked through the House and Senate. We may not have all the details yet, but the direction and pace are both fairly clear: we can expect regulation to be fast-tracked at the federal level to complement state-level bills. NIST and others will double down on developing benchmarks, standards, and measurement frameworks for AI technologies, algorithmic bias, explainability, and AI governance and risk management. Over the next four years, we will see increased collaboration on AI regulation, standards, certification, and auditing with European and international organizations and with neighboring countries, many of whom are already ahead of their U.S. counterparts. There are three easy steps you can take to avoid surprises down the road and to prepare your organization; first among them: don't wait for AI regulation to come to you! Together they create the foundation of a minimum viable framework for internal AI governance.
Machine Vision for Facial Enhancement Recognition
       
Machine Vision for Facial Enhancement Recognition. Image by Nojan Namdar via Unsplash (https://unsplash.com/photos/dUtizJyby4E). Problem statement: knowing whether a face has been digitally enhanced can be difficult in today's tech-driven world, creating a spectrum of problems that ranges from fake news to social-media-fueled mental illness. We were able to achieve 100% accuracy by removing the pooling layer, at the cost of creating a slow model. In turn, we created a second model in which a pooling layer was incorporated. 100% accuracy was reached again, showing that the model may benefit from color but does not rely on it. As mentioned before, we excluded the pooling layer.
Learning from Audio: The Mel Scale, Mel Spectrograms, and Mel Frequency Cepstral Coefficients
       
By now, we have developed a stronger intuition as to what spectrograms are, and how to create them. Simply put, spectrograms allow us to visualize audio and the pressure these sound waves create, thus allowing us to see the shape and form of the recorded sound. The main aim of this article is to introduce a new flavor of spectrograms — one that is widely used in the Machine Learning space as it represents human-like perception very well. As always, if you would like to view the code, as well as the files needed to follow along, you can find everything on my GitHub. Let’s first start by importing all our necessary packages.
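A minimal sketch of computing mel spectrograms and MFCCs with librosa (the audio path is a placeholder):

import numpy as np
import librosa

y, sr = librosa.load("speech.wav")            # placeholder file path
# Mel spectrogram: a power spectrogram projected onto the mel scale.
S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
S_db = librosa.power_to_db(S, ref=np.max)     # log scale for visualization
# Mel frequency cepstral coefficients, a compact perceptual summary.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
print(S_db.shape, mfcc.shape)                 # (n_mels, frames), (13, frames)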
Alzheimer Diagnosis with Deep Learning: A Survey
       
Certainly, the ImageNet database has millions of images, while OASIS Brains has MRI and PET data for only 1,098 subjects. This scarcity usually leads to overfitting. A common solution is to extract several random patches from the images, both two-dimensional [14] and three-dimensional [15]. But probably the most common technique is Transfer Learning [27], [31]–[33], [37], [38]. The negative class is usually over-represented compared with the positive class.
Python Packages for NLP-Part 1
       
Python Packages for NLP, Part 1. Photo by Micah Boswell on Unsplash. Natural Language Processing aims at manipulating human/natural language to make it understandable for machines. It deals with text analysis, text mining, sentiment analysis, polarity analysis, etc. There are different Python packages that make NLP operations easy and effortless. All NLP packages have different functionalities and operations which make it easier for the end user to perform text analysis and all sorts of NLP operations. In this series of articles, we will explore different NLP packages for Python and all of their functionalities.
A Guide to Streamlit — Frontend for Data Science Made Simpler
       
A Guide to Streamlit — Frontend for Data Science Made Simpler. Photo by Javier Allegue Barros on Unsplash. Many times in our data science jobs, or even in a cool side project we dedicate our time and energy to, we face the dilemma of how to showcase our work properly. In brief, this is the library that allows us to build a frontend for our machine learning and data science apps by writing all the code in Python. Along the way, you will gain experience with a lot of different Streamlit components that you can reuse in further projects. The Streamlit components: let's connect our helper function in the load_model.py module with a function in this file:

def classify(image_file):
    pred = perform_prediction(image_file)
    return pred

Let's now get on with the Streamlit app. The only thing we need to do now is to link this function inside our start.py.
One of the Few Innovations I’ve Seen From PCs in a Long Time Is From Dell
       
One of the Few Innovations I've Seen From PCs in a Long Time Is From Dell. A beautiful laptop PC running horrible Windows. For a long time, I have believed that the Windows ecosystem has been completely stagnant. This is why I have never believed that a thin Windows laptop can work great, considering my workstation can barely manage it. So I was pleasantly surprised to see the Dell Precision Optimizer app, which optimizes your PC based on your most-used programs. Granted, this works only for some of the newer Dell business PCs. It sucks that Dell releases programs like these and does not inform loyal customers who have spent tens of thousands of dollars on Dell workstation desktops and laptops. Do note that I have the latest drivers on my Dell PCs via the Dell Command Update utility.
11 Genius Cooking Hacks I Wish I Had Known Earlier in Life
       
Insanely tasty pasta sauce, hailed as the world's best, is only 3 ingredients in one pan. In the real world, prepping before cooking and washing up after cooking take time. The pasta recipe in Marcella Hazan's first book, The Classic Italian Cook Book, is one of them. You'll need one 28-ounce can of whole, peeled tomatoes; one medium peeled onion, halved; and 5 tablespoons of butter. At the end of 45 minutes, toss out the onion halves and pour the sauce over your favorite pasta.
This Is What A Harvard Economist Warned Me About A Decade Ago
       
Ten years ago I spoke at a marketing technology (aka "martech") event held at the fancy New York Times auditorium; a famous Harvard economist was the keynote speaker. And because you're so great, you completely wave it off as a bad idea that can never take hold, because steam power is here to stay! But this upstart country called the USA doesn't have the same sunk investment in steam-power greatness, and instead adopts electricity to power its factories. Both David and Simon explain that steam power and computing technology are general inventions rather than specific ones: steam power, electricity, and computing all provide a driving force that can be applied in many areas.
I’ve Worked in Game Development My Whole Career — Here’s Why I’m Learning Quantum Computing
       
I've Worked in Game Development My Whole Career — Here's Why I'm Learning Quantum Computing. By Amir Ebrahimi, Principal Software Engineer, Unity Technologies. What opened quantum computing up for me was realizing that it's even more connected to our physical universe than classical computing is. I remembered quantum computing and decided to hold myself to reading one quantum computing article a week during lunch. While classical computing has afforded some moments like this, such as performing any boolean operation with only NAND gates, quantum computing delivers in spades. I find myself facing even more "aha" moments in quantum computing than I ever had in classical computing. Read previous "Why I'm Learning Quantum Computing" stories here and here, and get started with Qiskit here!
6 Holidays Black People Refuse to Celebrate, Ranked
       
6 Holidays Black People Refuse to Celebrate, Ranked. Photo: RiverNorthPhotography/Getty Images. 6. All the aggression of March 17, multiplied by the anxiety of the winter holidays. President's Day: every president is a war criminal, even the Black ones, so yes, please keep this holiday. If you feel the need to celebrate George Washington so bad, just remember his dentures were likely made from enslaved people's teeth. Do not even think about pranking Black people unless you want a side-eye at best or some words about your mother at... well, at second-best.
Basics of Time Series with Python
       
Basics of Time Series with Python. Photo by Isaac Smith on Unsplash. Time series analysis is part of the daily activities happening around us with respect to time. What do we deal with in time series data? datetime: provides the basic date and time functionality in Python. calendar: the Calendar class contains many calendar-related functions.

index = pd.DatetimeIndex(['2020-1-20', '2020-02-01', '2021-01-01', '2021-02-01'])
index
# Output:
# DatetimeIndex(['2020-01-20', '2020-02-01', '2021-01-01', '2021-02-01'],
#               dtype='datetime64[ns]', freq=None)

Suppose we want to make another series and keep the index series as an index to that new series.
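Following that idea, a small sketch of reusing a DatetimeIndex as the index of a new Series (the values are arbitrary):

import pandas as pd

index = pd.DatetimeIndex(['2020-01-20', '2020-02-01', '2021-01-01', '2021-02-01'])
# Attach arbitrary observations to the datetime index.
series = pd.Series([10, 20, 30, 40], index=index)
print(series['2021'])   # partial-string indexing selects all 2021 entries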
Genetic Algorithm for Trading Strategy Optimization in Python
       
Genetic Algorithm for Trading Strategy Optimization in Python: how can a GA help cut down the problem space and converge toward a better solution? Enter the genetic algorithm (GA): a probabilistic, heuristic search algorithm inspired by Darwin's theory of natural selection, in which the fittest survive through generations. In this blog, we are going to use a GA as an optimization algorithm for identifying the best set of parameters. Remember, this is only a demonstration of applying a GA to optimize a trading strategy and should not be copied or followed blindly. If you would like to learn more about how to avoid overfitting a genetic algorithm, there is a sequel to this blog that focuses on techniques for a more robust genetic algorithm.
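A bare-bones sketch of the GA loop itself; the fitness function is a stand-in (a real one would backtest the strategy's parameters), and the population sizes and rates are arbitrary:

import random

def fitness(params):
    # Stand-in objective: reward parameters near (20, 50).
    fast, slow = params
    return -((fast - 20) ** 2 + (slow - 50) ** 2)

pop = [(random.randint(5, 50), random.randint(20, 200)) for _ in range(30)]
for generation in range(40):
    pop.sort(key=fitness, reverse=True)
    survivors = pop[:10]                       # selection: fittest survive
    children = []
    while len(children) < 20:
        a, b = random.sample(survivors, 2)
        child = (a[0], b[1])                   # crossover of two parents
        if random.random() < 0.2:              # occasional mutation
            child = (child[0] + random.randint(-2, 2), child[1])
        children.append(child)
    pop = survivors + children
print(pop[0])   # best parameter set found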
Get Published With Towards AI
       
Towards AI is the world's leading multidisciplinary publication focused on science, technology, and engineering, with an emphasis on diversity, equity, and inclusion. Example of a curated article in Towards AI, through the Medium Partner Program. A word on the Medium paywall: our authors have the freedom to publish with no paywall or through the Medium paywall. Once you've published your first article with us, you are considered a writer with Towards AI and can submit your posts directly through Medium. Resources: Towards AI uses the Associated Press (AP) style for writing and the IEEE style for citations. Further reading:
Support Vector Machine (SVM) Explained
       
Support Vector Machines (SVM) is a core algorithm used by data scientists. Because SVMs operate through kernels, they are excellent at solving nonlinear problems as well. Table of contents: what is SVM (support vectors, hyperplane, margin); advantages; disadvantages; implementation; conclusion; resources. What is SVM: Support Vector Machine is a supervised learning algorithm that identifies the best hyperplane to divide the dataset. Figure 1: a 2-dimensional space with a 1-dimensional hyperplane dividing the data into different classes. This can be imagined as the data points either rising or falling while a plane tries to separate them into their appropriate classes.
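A minimal scikit-learn sketch of a kernel SVM on a toy nonlinear dataset:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# The RBF kernel lets the SVM learn a nonlinear decision boundary.
clf = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)
print(clf.score(X_test, y_test))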
Naive Bayes Algorithm for Classification
       
Naive Bayes Algorithm for Classification. Classification is one of the most used forms of prediction, where the goal is to predict the class of a record. In this article, we will see how to use the Naive Bayes algorithm for a multiclass classification problem by implementing it in Python. Bayes' theorem provides a way to calculate the probability of a hypothesis given our prior knowledge. P(money|spam) is the probability that a mail includes "money" given that the mail is spam. P(spam|money) is the probability that a mail is spam given that it includes "money" in the text.
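Bayes' theorem ties those two quantities together; a tiny numeric sketch with made-up corpus statistics:

# Made-up probabilities, purely for illustration.
p_spam = 0.3                      # P(spam)
p_money_given_spam = 0.4          # P("money" | spam)
p_money_given_ham = 0.05          # P("money" | not spam)

# Total probability of seeing "money" in any mail.
p_money = p_money_given_spam * p_spam + p_money_given_ham * (1 - p_spam)
# Bayes' theorem: P(spam | "money") = P("money" | spam) P(spam) / P("money").
p_spam_given_money = p_money_given_spam * p_spam / p_money
print(p_spam_given_money)   # ~0.774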
Plotting decision trees in-memory
       
Plotting decision trees in-memory. Introduction: decision trees are inherently interpretable classifiers. Plotting decision trees: the most widely used library for plotting decision trees is Graphviz. With the naive approach we have to perform 6 operations per image: save the DOT file, load the DOT file, save the image, load the image, delete the DOT file, delete the image. Code and time comparison: code for both functions (in-memory and with disk operations) is presented below. Summary: in this article you've learned how to plot decision trees with Graphviz without disk operations, using just basic libraries and achieving shorter code, under 10 lines.
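A sketch of the in-memory idea, assuming scikit-learn's export_graphviz together with the graphviz Python package:

import graphviz
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3).fit(X, y)

dot = export_graphviz(clf, filled=True)               # DOT source as a string
png_bytes = graphviz.Source(dot).pipe(format="png")   # render in memory
# png_bytes can be displayed or passed on without ever touching the disk.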
What is Stratified Cross-Validation in Machine Learning?
       
What is Stratified Cross-Validation in Machine Learning? Before diving deep into stratified cross-validation, it is important to know about stratified sampling. Implementing hold-out cross-validation with stratified sampling: we'll implement hold-out cross-validation with stratified sampling such that the training and test sets have the same proportion of the target variable. Implementing k-fold cross-validation without stratified sampling: k-fold cross-validation splits the data into 'k' portions. Implementing k-fold cross-validation with stratified sampling: stratified sampling can be implemented with k-fold cross-validation using the 'StratifiedKFold' function of scikit-learn.
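A minimal sketch of StratifiedKFold (the dataset is a placeholder; each fold keeps the class ratio of the full data):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold

X, y = load_breast_cancer(return_X_y=True)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
# Every fold preserves the class proportions of the whole dataset.
for train_idx, test_idx in skf.split(X, y):
    print(y[train_idx].mean(), y[test_idx].mean())   # similar ratios per fold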
Shaping and reshaping NumPy and pandas objects to avoid errors
       
The last lines read: ValueError: Expected 2D array, got 1D array instead: array=[2020. Let's try to follow the error message's instructions: x.reshape(-1, 1). Reshaping is great if you passed a NumPy array, but we passed a pandas Series. I'll make the result a capital X, because it will be a 2D array, and that's the convention for 2D arrays (matrices). They assume you have just one column of text, so they expect a 1D array instead of a 2D array. If you found this article on reshaping NumPy arrays helpful, please share it on your favorite social media.
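A small sketch of the Series-to-2D fix (the column name is a placeholder):

import pandas as pd

df = pd.DataFrame({"year": [2020, 2021, 2022]})
x = df["year"]            # pandas Series -> 1D, which upsets scikit-learn
X = df[["year"]]          # one-column DataFrame -> 2D
print(x.shape, X.shape)   # (3,) vs (3, 1)
# Equivalent NumPy route: df["year"].to_numpy().reshape(-1, 1)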
A Summary of the Basic Machine Learning Models
       
Awesome, now that I have you hooked, let's get to it, starting from the very beginning: Linear Regression. Logistic Regression is the sibling of Linear Regression, used for classification instead of regression problems. Decision Trees are very versatile machine learning models that can be used for both regression and classification. In our case, Random Forests are an ensemble of many individual Decision Trees, the family of machine learning models we saw just before. Aside from this, the training procedure is exactly the same as for an individual Decision Tree, repeated N times.
Using Google Cloud services for MLOps (GCP)
       
GCP services to orchestrate ML workflows: now let's assume that your MLOps process is defined and that you are finally ready to leverage tools to support your AI practices. AutoML on GCP, from data acquisition to prediction: another option for MLOps on Google Cloud is to create a pipeline using GCP services. There are fully managed GCP services that you can use to automate data extraction, data preparation, and model training. Let's take a simple example with four essential GCP services: BigQuery, Cloud Storage, AI Platform, and Cloud Composer. MLOps frameworks on GCP: I just covered some simple GCP service options to support basic MLOps processes.
Seeing The World Through Computers’ Eyes (And Why It Matters To You)
       
But here is the million-dollar question to answer: how can we achieve these amazing computer vision capabilities with machine learning? 2nd approach, deep learning: when thinking about deep learning, many of us think of some deep dark mystery. First and foremost, it's helpful to understand that deep learning is a subset of machine learning. So why might we want to consider deep learning over traditional machine learning? Compared to traditional machine learning, deep learning requires much more training data and computing power to make it work.
Preventing Black Pain: Deep Learning Illuminates Decades-Long Knee Pain Mystery
       
Preventing Black Pain: Deep Learning Illuminates a Decades-Long Knee Pain Mystery. A recent deep learning model took steps toward solving a long-running mystery: specifically, why do Black people suffer more pain from knee osteoarthritis? Instead, this research team trained a convolutional neural network directly on X-rays to predict patients' pain. Photo from Wikimedia Commons. The dataset involved 4,172 US patients who had, or were at high risk of developing, knee osteoarthritis. Better pain prediction leads to better outcomes for patients.
Serving TensorFlow models with TF Serving
       
Increasing the number of convolutional filters means that we provide more data to the CNN, as it captures more combinations of pixel values from the input image tensor. The trained model has been named SimpsonsNet (this name will be used later while serving the model, as its identifier). Finally, once trained, we need to dump the model in the SavedModel format, the universal serialization format for TensorFlow models. This format provides a language-neutral way to save ML models that is recoverable and hermetic, and it enables higher-level systems and tools to produce, consume, and transform TensorFlow models. The resulting model directory should look more or less like the following:

assets/
assets.extra/
variables/
    variables.data-?????-of-?????
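A minimal sketch of exporting a Keras model to SavedModel under a numeric version folder, as TF Serving expects (the architecture and paths are placeholders, not the article's SimpsonsNet):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,)),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# TF Serving discovers numeric version subfolders under the model base path.
tf.saved_model.save(model, "models/SimpsonsNet/1")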
A Loss Function Suitable for Class Imbalanced Data: “Focal Loss”
       
Class imbalance in computer vision: in computer vision problems, class imbalance can be even more critical, and here we discuss how the authors approached object detection tasks, which led to the development of focal loss. One-stage detectors were fast, but their accuracy at the time was around 10–40% of that of two-stage detectors. The authors identified class imbalance during training as the main obstacle preventing one-stage detectors from reaching the accuracy of two-stage detectors. An example of such class imbalance is shown in the self-explanatory Figure 1, taken from the original authors' presentation [2]. Below is the definition of focal loss: a modulating factor is multiplied with the cross-entropy loss.
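For reference, the focal loss from the original paper (Lin et al.), where p_t is the predicted probability of the true class, γ ≥ 0 is the focusing parameter, and α_t is an optional class-balancing weight:

\mathrm{FL}(p_t) = -\alpha_t \, (1 - p_t)^{\gamma} \log(p_t)

With γ = 0 (and α_t = 1) this reduces to the ordinary cross-entropy; larger γ down-weights easy, well-classified examples so training focuses on the hard ones.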
Cost Efficient Distributed Training with Elastic Horovod and Amazon EC2 Spot Instances
       
Cost Efficient Distributed Training with Elastic Horovod and Amazon EC2 Spot Instances. Photo by Jason Leung on Unsplash. Horovod is a popular framework for running distributed training on multiple GPU workers and across multiple hosts. In part 2 we talk about data-distributed training and introduce the challenge of training on multiple spot instances. In part 4 we demonstrate an example of Elastic Horovod on Amazon EC2 spot instances. In AWS these are called Amazon EC2 Spot Instances, in Google Cloud they are called Preemptible VM Instances, and in Microsoft Azure they are called Low-Priority VMs. Part 4, Elastic Horovod on Amazon EC2 Spot Instances: in this section we will demonstrate an end-to-end example of Elastic Horovod on an Amazon EC2 cluster.
Excel vs Python: How to do Common Data Analysis Tasks
       
Excel vs Python: How to do Common Data Analysis Tasks. Photo by Goran Ivos on Unsplash. Excel is the most commonly used data analysis software in the world. In this article, we'll take a look at some common data analysis tasks to demonstrate how accessible Python analysis can be. Even though Excel is great, there are some areas where a programming language like Python is better for certain types of data analysis. Sorting data using Excel (image by author): in pandas, we use the DataFrame.sort_values() method. Summing data using Excel (image by author): in pandas, when we perform an operation, it is automatically applied to every row at once.
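A small sketch of both operations in pandas (the toy DataFrame is a placeholder):

import pandas as pd

df = pd.DataFrame({"product": ["a", "b", "c"], "sales": [250, 100, 175]})

# Sorting: the pandas counterpart of Excel's Sort dialog.
print(df.sort_values("sales", ascending=False))

# Summing: column operations broadcast over every row at once.
print(df["sales"].sum())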
White Sox Announce 2021 Spring Training Broadcast Schedule
       
Spring Training Games to be Featured on NBC Sports Chicago, ESPN 1000 and Whitesox.comCHICAGO — The Chicago White Sox, NBC Sports Chicago and new radio rightsholder ESPN 1000 have announced the team’s 2021 spring training television, radio and webcast schedule. The White Sox will have six Cactus League games televised from Arizona on NBC Sports Chicago, the exclusive television home of White Sox baseball, with nine additional webcasts streamed on whitesox.com. NBC Sports Chicago also will provide a live stream of its White Sox spring training telecasts to authenticated subscribers via the “MyTeams by NBC Sports” app and at NBCSportsChicago.com. New in 2021, all webcasts will be available on the White Sox official Facebook page and YouTube channel, in addition to whitesox.com. The first spring training live stream occurs Friday, March 5 at 2:05 p.m. CT when the White Sox take on the Seattle Mariners at CR-G.
Slate Star Codex and the Gray Lady’s Decay
       
The skirmish began last June when the semi-pseudonymous Scott Alexander, a Bay Area psychiatrist who had been writing a blog called Slate Star Codex since 2013, abruptly deleted all his posts. The reason: A New York Times technology reporter was working on a story about Slate Star Codex and was insistent on disclosing his real identity. It’s titled “Silicon Valley’s Safe Space” (that would be Slate Star Codex). At Slate Star Codex, it meant that “these things cause me harm” was not enough to shut down discussion of difficult issues such as, say, false accusations of rape. But Slate Star Codex with its intellectual heterodoxy and frequent criticisms of the “social justice” movement was simply too dissonant with what is now the dominant value system at The New York Times.
Node.js Developer Interview Questions
       
Beyond the usual questions about skills, qualifications, experience, etc., what really uncovers a Node.js developer's understanding is how they answer technical questions about Node.js. More importantly, when hiring Node.js developers, the more experienced the developer is, the more applied the interview questions should be. However, regardless of experience, the most critical questions to ask Node.js developers during an interview revolve around the following: how Node.js works, and why to choose Node.js for development (after all, there are many alternatives, so why Node.js?). Node.js uses a single-threaded model, which would typically render performance slow, especially when executing multiple operations at the same time. Instead, Node.js uses callbacks.
6 U.S. Presidents You Most Likely Didn’t Know Were Black, Ranked
       
6 U.S. Presidents You Most Likely Didn't Know Were Black, Ranked. Photo: Raymond Boyd/Getty Images. 6. Abraham Lincoln: Honest Abe's connection to Black folks goes further than his role in ending slavery. Warren Harding: like Lincoln, Warren Gamaliel Harding (yes, the original Warren G) was rumored to have Black ancestry on both sides of his family tree. Per Rogers' pamphlet The 5 Black Presidents, Jackson's dad was allegedly Black and his oldest brother was reportedly sold into slavery. Jefferson also literally owned a slave plantation, enslaving more than 600 Black people in his lifetime and freeing fewer than 10 along the way.
A Gentle Introduction to Stochastic Optimization Algorithms
       
Examples of stochastic optimization algorithms include simulated annealing and genetic algorithms. What is stochastic optimization? Now that we have an idea of what stochastic optimization is, let's look at some examples of stochastic optimization algorithms: iterated local search, stochastic hill climbing, stochastic gradient descent, tabu search, and greedy randomized adaptive search procedure. Some examples inspired by biological or physical processes include: simulated annealing, evolution strategies, genetic algorithms, differential evolution, and particle swarm optimization. Now that we are familiar with some examples of stochastic optimization algorithms, let's look at some practical considerations when using them.
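A tiny sketch of one of these, stochastic hill climbing on a 1-D objective (the objective and step size are arbitrary choices):

import random

def objective(x):
    return -(x - 3) ** 2            # maximum at x = 3

x = random.uniform(-10, 10)
for _ in range(1000):
    candidate = x + random.gauss(0, 0.5)    # random perturbation
    if objective(candidate) >= objective(x):
        x = candidate                       # accept only improvements
print(x)   # should be close to 3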
Data-Driven Predictive Maintenance In a Nutshell
       
In general, Markov methods model system degradation as a stochastic process that jumps between a finite set of states Φ = {0, 1, …, N}. The primary assumption of the Markov model is that the future system state depends only on the current system state. 3.3 Hidden Markov models: hidden Markov models (HMMs) are useful for systems whose states can only be indirectly observed and evolve in a discrete manner. Stochastic filtering methods emerge from a larger field of study known as data assimilation. Stochastic filtering methods use Bayesian learning to iteratively update the system state and the parameters that govern the state's evolution as new measurements become available.
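A toy simulation of such a discrete-state degradation process (the transition probabilities below are invented for illustration):

import numpy as np

# States 0 (healthy) .. 2 (failed); row = current state, column = next state.
P = np.array([[0.90, 0.08, 0.02],
              [0.00, 0.85, 0.15],
              [0.00, 0.00, 1.00]])   # failure is an absorbing state

rng = np.random.default_rng(0)
state, steps = 0, 0
while state != 2:
    # Markov property: the next state depends only on the current state.
    state = rng.choice(3, p=P[state])
    steps += 1
print(steps, "steps to failure")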
Categorical cross-entropy and SoftMax regression
       
Properties of the categorical cross-entropy for SoftMax regression: given an example x, the softmax function can be used to model the probability that it belongs to class y = k as

P(y = k | x; W) = exp(w_k^T x) / Σ_j exp(w_j^T x),

where W are the parameters of our model. A convex function: our goal is to find the weight matrix W minimizing the categorical cross-entropy. It can be shown nonetheless that minimizing the categorical cross-entropy for SoftMax regression is a convex problem and, as such, any minimum is a global one! Over-parameterization: before moving on with how to find the optimal solution and the code itself, let us emphasize a particular property of SoftMax regression: it is over-parameterized! Without loss of generality, we can thus assume, for instance, that the weight vector w associated with the last class is equal to zero.
A Practical Dive into Data Fusion For Self-Driving Cars
       
Then, we'll dive into a specific case study on "sparse" data fusion for self-driving cars to see how data fusion is used in action. Data fusion can generalize to combining N distinct data sources, so long as each of these data sources is co-referenced/calibrated with the others. Some considerations to think about when using data fusion as part of any intelligent system that uses data to make predictions or decisions: what steps of data fusion can be performed online? An example of data fusion for self-driving cars is the fusion of RGB pixels and lidar point clouds. We discussed some example fields and domains where data fusion is used, focusing particularly on the applications of data fusion for self-driving cars.
5 Recommended Articles For Data Scientists (Feb 15)
       
You are in for a treat with this week's recommended articles for data scientists. As usual, there has been a steady stream of well-written and informative articles from AI writers on the Medium platform. There are two articles you must read in this week's list. The first article addresses the difference between the ML engineer and data scientist roles, important information for those currently job seeking. And the second article showcases a pragmatic utilisation of data science skills.
Accessing Facebook API for Instagram Business
       
If you want to create a more thorough analysis of Instagram data, I suggest you access Instagram data and leverage your business accounts using the Instagram Graph API. The Instagram Graph API allows Instagram professionals, businesses and creators, to use your app to manage their presence on Instagram. An Instagram Business account turns your Instagram account into a brand account and gives you insights on stories, posts, and followers. Accessing your Instagram data with the Facebook API: create your developer account. Before specifying the data, you need to register and create a developer account in the Facebook Developer portal. Finding your Business ID and tokens: you can find all of your API access controls in the Facebook Graph API Explorer.
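Once you have an access token, a Graph API call is just an HTTPS GET; a hedged sketch with requests (IG_USER_ID, TOKEN, the field list, and the API version are placeholders based on common Graph API usage, and versions change over time):

import requests

IG_USER_ID = "1784...."   # your Instagram Business account ID (placeholder)
TOKEN = "EAAB...."        # access token from the Graph API Explorer (placeholder)

resp = requests.get(
    f"https://graph.facebook.com/v12.0/{IG_USER_ID}",
    params={"fields": "username,followers_count,media_count",
            "access_token": TOKEN},
)
print(resp.json())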
Unit Testing in Deep Learning
       
We will start with a brief introduction to unit tests, followed by an example of unit tests in deep learning and how to run them both via the command line and the VS Code Test Explorer. Passing unit tests improves confidence in the unit itself: if it passes, we are sure there is nothing obviously wrong with the logic and the unit is performing as intended. Unit tests in Python: every language has its own tools and packages for unit testing. Python's built-in unittest framework is inspired by JUnit and has a similar flavour to major unit-testing frameworks in other languages. VS Code unit test failure example (source: by the author). This concludes the article on unit testing for deep learning.
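A minimal sketch of a deep-learning-style unit test with Python's built-in unittest (the "model" here is a stand-in linear layer; shape checks like this are the kind of property such tests assert):

import unittest
import numpy as np

def forward(x, w, b):
    # Stand-in "model": a single linear layer.
    return x @ w + b

class TestForward(unittest.TestCase):
    def test_output_shape(self):
        x = np.zeros((4, 8))                 # batch of 4, 8 features
        w, b = np.zeros((8, 3)), np.zeros(3)
        self.assertEqual(forward(x, w, b).shape, (4, 3))

if __name__ == "__main__":
    unittest.main()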
Pump it Up with CatBoost
       
Pump it Up with CatBoost. Photo by sofiya kirik on Unsplash. Introduction: this article is based on the competition Driven Data published about water pumps in Tanzania. In poor households, families often have to spend several hours walking to obtain water from water pumps. A significant share of the water pumps is entirely out of order or practically does not function; the others require repair. It is worth noting the small number of labels for water pumps in need of repair. Among the characteristics of the water pumps, there is one that shows the amount of water.
How CatBoost encodes categorical variables?
       
Please note that we compute the statistics on the training data and apply them equally on the test set. With this simple example, we show that the target leakage problem is an issue with leave-one-out target encoding. Ordered target statistics: finally, we arrive at the procedure introduced by CatBoost to encode categorical variables. Now, once the permutation is done, how is the categorical statistic computed? As a final remark, I briefly mentioned in this article that there is a method called Bayesian target encoding.
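A sketch of the ordered-target-statistic idea: each example is encoded using only the examples that precede it in a random permutation, so its own target never leaks into its encoding (the prior weight a and prior p follow the usual smoothed formula; the data is a toy):

import numpy as np
import pandas as pd

df = pd.DataFrame({"cat": ["a", "b", "a", "a", "b", "a"],
                   "target": [1, 0, 0, 1, 1, 1]})
rng = np.random.default_rng(0)
perm = rng.permutation(len(df))

a, p = 1.0, df["target"].mean()      # prior weight and prior value
sums, counts = {}, {}
encoded = np.empty(len(df))
for i in perm:                        # walk the permutation "through time"
    c = df.loc[i, "cat"]
    # Only previously seen rows of this category contribute: no target leakage.
    encoded[i] = (sums.get(c, 0.0) + a * p) / (counts.get(c, 0) + a)
    sums[c] = sums.get(c, 0.0) + df.loc[i, "target"]
    counts[c] = counts.get(c, 0) + 1
df["cat_encoded"] = encoded
print(df)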
Master Positional Encoding: Part I
       
We are looking for a positional encoding tensor, not a positional encoding network. It means that our positional encoding vector now becomes a positional encoding matrix. First, let's summarize our current positional encoding matrix: the row vector v is a vector of sines evaluated at a single position x, but with varying frequencies; each row vector represents the positional encoding vector of a single, discrete position. This figure shows the dot product between a particular positional encoding vector, representing the 128th position, and every other positional encoding vector. By continually adjusting and altering our guesses to incorporate more desired characteristics, we eventually land on the sinusoidal positional encoding matrix.
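For reference, the standard sinusoidal encoding from "Attention Is All You Need" as a short NumPy sketch (max_len and d_model are arbitrary; d_model is assumed even):

import numpy as np

def positional_encoding(max_len=512, d_model=64):
    pos = np.arange(max_len)[:, None]            # one row per position
    i = np.arange(d_model // 2)[None, :]
    freq = 1.0 / 10000 ** (2 * i / d_model)      # geometric frequency ladder
    pe = np.empty((max_len, d_model))
    pe[:, 0::2] = np.sin(pos * freq)             # even dimensions: sine
    pe[:, 1::2] = np.cos(pos * freq)             # odd dimensions: cosine
    return pe

pe = positional_encoding()
print(pe.shape)   # (512, 64): one encoding row per discrete position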
3 Habits of Incredibly Healthy People
       
3 Habits of Incredibly Healthy PeoplePhoto by LYFE Fuel on UnsplashI would consider myself a generally healthy individual. I exercise regularly, eat well balanced meals with plenty of fruits and vegetables, drink a lot of water, and sleep 7–8 hours each night. Am I one of the healthiest people in the world? Not by a long shot. I’m a constant worrier, I drink too much coffee, I…
I’ll send my M1 Apple back. It’s not ready yet.
       
I'll send my M1 Apple back. After reading lots of rave reviews about M1 Macs, I decided that, because I was long overdue a speed boost on my 2013 MacBook Pro, I'd try out a Mac Mini. Read on for basics like Finder and System Preferences not working, as well as more specialist problems that software developers will suffer. I'm experiencing cognitive dissonance, because there are lots of rave reviews, but it's cost me a lot of time, money, and frustration. Here are some facts that are relevant to anybody considering an M1, especially engineers, scientists, and software developers.
What happens if you plank every day
       
By Madison VanderbergI tried to plank for five minutes each day for a month and it was far from what I expected. I popped back into plank position for another 30 seconds, collapsed, tried again, held the plank for 20 more seconds, collapsed, and decided to call it for the day. Instead, he told me to try a side-arm plank, where you start in plank position and then shift your weight to one arm. During week two, I changed up my routine and had a grand revelationAt the start of week two, my go-to plank routine was as follows: a forearm plank that lasted between 60 and 90 seconds, a 45-second side-arm plank on my right side, a short break, a 45-second side-arm plank on my left side, a short break, and then a regular plank for however long I could manage. My magic recipe was as follows: 90-second plank, knees down for five seconds, 60-second side-arm plank, 60 seconds on the other arm, knees down for five seconds, 30-second plank, knees down, then finish up with 30-ish seconds in the standard plank position.
The Elites Have Set the Stage For the Greatest Economic Crisis In History
       
Time after time throughout history, speculative manias seemingly emerge out of nowhere, fueling extraordinary booms that are the prelude to devastating busts. But when the 2008 financial crisis came along, the rules of the game changed. Every speculative asset class has been "securitized": stocks, corporate debt, real estate, fine wine, fine art, vintage guitars. The elites have replaced America's economy with "the Everything Bubble", and it's harder than ever to achieve the American dream. It's there to make you think the government is fiscally responsible, but it rises every time the system reaches a crisis point.
Interaction design is more than just user flows and clicks
       
For our purposes as designers, working memory (part of short-term memory) is the most relevant. The shortest type of memory is known as working memory, which typically only lasts a couple of seconds during a task. Miller's Law states that the average person can only keep 7 ± 2 (roughly five to nine) items in their working memory at a time. The working memory required to complete a task within your product is proportional to the MIC burden you impose on your users. Conversely, at no point should your task require the user to hold more than seven items in their working memory at any moment.
Will Scrum Fall Victim to Its Own Success?
       
Then here’s a change to the Scrum Master that feels like a step back:“The Scrum Team consists of one Scrum Master” — Scrum Guide 2020This wasn’t there before. Why is it important for Scrum Teams to have one person as a Scrum Master? The Scrum Guide is clear about it:“The Scrum framework, as outlined herein, is immutable. While implementing only parts of Scrum is possible, the result is not Scrum.” — Scrum Guide 2020I don’t agree this is the case here. Their description is beyond the purpose of the Scrum Guide because they are context sensitive and differ widely between Scrum uses.” — Scrum Guide 2020Non-core rules and practices can be stripped from Scrum and find a place elsewhere.
Three Things in Life That Aren’t Worth The Effort
       
Personal relationships are a vacuum. Nearly a decade ago, I was sitting on a couch in a marriage counselor's office. I think he'd grown tired of me waving my hands around, explaining away all the bad omens in our relationship and why I thought we could fix things. I was clawing to save a relationship the other person didn't want to be in. I looked back through time with curious fascination, wondering why I tried so hard to save things. When a person, in no uncertain terms, says they want to leave, let them.
10 easy tricks to improve your website design
       
Note: yes, the ink spot is smaller than the rectangle, but being sharp and detailed makes it attract more attention. But beware: in a real website you won't have that extra space, so it's just a presentation trick! Be careful: some typefaces run much smaller or larger than others. Base your sizing on standard typefaces like Roboto: if the font is as large as 14–18pt Roboto, it's perfect for paragraphs. Also, pastel designs are extremely trendy these days, so let's start experimenting with a new area of our color picker.
Introduction to K-means Clustering: Implementation and Image Compression.
       
Introduction to K-means Clustering: Implementation and Image Compression. K-means clustering is one of the most popular unsupervised clustering algorithms. In this article, I would like to introduce the idea of K-means clustering through its Python implementation and show how it can be used for image compression. Visualizing the color space using point clouds: as can be seen, our original image is very rich in color. Using the K-means clustering algorithm, we can generalize all these colors by their corresponding centroids.
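A compact sketch of the compression idea with scikit-learn and Pillow (the image path is a placeholder; k = 16 colors):

import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

img = np.asarray(Image.open("photo.jpg")) / 255.0   # placeholder RGB image
pixels = img.reshape(-1, 3)                         # one row per pixel

kmeans = KMeans(n_clusters=16, n_init=4).fit(pixels)
# Replace every pixel by its cluster centroid: only 16 colors remain.
compressed = kmeans.cluster_centers_[kmeans.labels_].reshape(img.shape)
Image.fromarray((compressed * 255).astype(np.uint8)).save("compressed.jpg")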
Neural Query Expansion for Code Search
       
One of the earlier works in this domain is Neural Code Search (NCS) by Facebook, which takes in a natural-language query and outputs relevant code snippets. Hence, the authors propose a query expansion engine called Neural Query Expansion (NQE), which, when used along with NCS, performs better than NCS alone. The NQE model the authors propose is an encoder-decoder model that takes a query X as input and produces a sequence of method names Y = ⟨y1, …, yn⟩. Neural Code Search (NCS) is an unsupervised method to match a given input query with a relevant code segment. They use fastText to represent words in queries and code snippets in a dense semantic hyperspace.
10 Must-Know Python Topics for Data Science
       
10 Must-Know Python Topics for Data Science. Photo by Navin Rai on Unsplash. Python is dominating the data science ecosystem. I think the top two reasons for such dominance are that it is relatively easy to learn and that it offers a rich selection of data science libraries. Python is a general-purpose language, so it is not just for data science. If you are using Python only for data science tasks, you do not have to be a Python expert. These topics can be considered base Python for data science.
How I created a real-time Sentiment Analyzer
       
How I created a real-time Sentiment Analyzer. Image by Mohamed Hassan from Pixabay. Sentiment analysis is a natural language processing technique to predict the sentiment or opinion of a given text. Sentiment analysis is widely used to predict the sentiment of reviews, comments, survey responses, social media, etc. A sentiment analyzer model can predict whether a given text is positive, negative, or neutral. In this article, we will focus on developing a real-time sentiment analyzer that can predict the sentiment of speech in real time using open-source Python libraries such as NLTK and TextBlob. For sentiment analysis, TextBlob provides two algorithms under the hood: PatternAnalyzer (the default), which uses the same implementation as the pattern library.
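TextBlob's default analyzer is a one-liner to try (polarity lies in [-1, 1], subjectivity in [0, 1]):

from textblob import TextBlob

blob = TextBlob("I really enjoyed this article, it was clear and helpful!")
# polarity > 0 suggests positive sentiment, < 0 negative, ~0 neutral.
print(blob.sentiment)   # Sentiment(polarity=..., subjectivity=...)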
50+ Statistics Interview Questions and Answers for Data Scientists for 2021
       
Q: Give some examples of random sampling techniques. Simple random sampling requires using randomly generated numbers to choose a sample. It might make sense here to use stratified random sampling to equally represent the opinions of students in each department. Exposure bias includes clinical susceptibility bias, protopathic bias, and indication bias. Second, mean imputation reduces the variance of the data and increases bias in our data.
Keras Callbacks and How to Save Your Model from Overtraining
       
Keras Callbacks and How to Save Your Model from Overtraining. Photo by Karsten Winegeart on Unsplash. In this article, you will learn how to use the ModelCheckpoint callback in Keras to save the best version of your model during training. Then I learned about Keras callbacks, and specifically ModelCheckpoint! We can use the Keras callback keras.callbacks.ModelCheckpoint() to save the model at its best-performing epoch. Saving weights only will save space, but will not save the entire model architecture. Specifically, you learned how to use the ModelCheckpoint callback to save the best version of your model before it over-trains, and a few ways to customize the callback.
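The core of it is just a few lines; a minimal sketch (the file path and monitored metric are the usual knobs, and the fit call is commented out because the model and data are not defined here):

from tensorflow import keras

checkpoint = keras.callbacks.ModelCheckpoint(
    "best_model.h5",
    monitor="val_loss",       # watch validation loss...
    save_best_only=True,      # ...and keep only the best epoch's weights
)
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=50, callbacks=[checkpoint])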
Classification in Security Operations
       
Security operations centers are constantly solving a set of classification problems. Given some set of input data, security analysts must initially determine whether activity is malicious or non-malicious. Classification decision 5, mitigation effectiveness: one day our AI overlords may automatically generate mitigations based on context data. Targeting machine learning and analytics to the problem: those unfamiliar with security operations may think that a single ML algorithm could assist and eventually replace security analysts. In fact, the images above and the narrowing of security operations to five decisions are oversimplifications of the challenges of security operations.
What Is Neural Network Implementation, Anyway?
       
Part 1 — Importing the required packages

from sklearn import datasets
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

→ The 1st line imports a dataset from the scikit-learn library itself.
→ The 2nd line imports the MLPClassifier (multi-layer perceptron), which is our neural network.
→ The 3rd line is part of data preprocessing: it splits the entire dataset into training and testing sets.
→ The 4th line is part of data preprocessing: it applies feature scaling to our dataset.
→ The 6th line gets the complete classification report, consisting of accuracy, F1-score, precision, and recall as performance metrics.
Evolving Neural Networks in JAX
       
Evolving Neural Networks in JAX. "So why should I switch from … to JAX?" We will implement the CMA-ES update equations in JAX. If you feel like you need to catch up on these, check out the JAX quickstart guide or my JAX intro blog post. We can then use an exponentially scaled update if ||p_σ|| deviates from its expectation:

def update_sigma(params, memory):
    """Update the stepsize sigma."""
    ...

We vmap over both the number of different evaluation episodes and all the different neural networks in our current population.
Deepmind releases a new State-Of-The-Art Image Classification model — NFNets
       
Deepmind releases a new State-Of-The-Art Image Classification model — NFNets. Photo by Boitumelo Phetla on Unsplash. Our smaller models match the test accuracy of an EfficientNet-B7 on ImageNet while being up to 8.7× faster to train, and our largest models attain a new state-of-the-art top-1 accuracy of 86.5%. Although the proposed model might be the most interesting bit, I still find the analysis of previous work very interesting. Batch normalization solves the mean-shift problem. Essentially, gradient clipping is used to stabilize model training [1] by not allowing the gradient to go beyond a certain threshold. In addition to AGC (adaptive gradient clipping), they also used dropout to substitute for the regularization effect that batch normalization was offering.
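For reference, AGC from the NFNets paper clips a gradient based on the ratio of gradient norm to weight norm, rather than on a fixed threshold (λ is the clipping threshold; the paper's unit-wise norms and the small epsilon on the weight norm are omitted here for brevity):

G \leftarrow \begin{cases} \lambda \, \dfrac{\|W\|}{\|G\|} \, G & \text{if } \dfrac{\|G\|}{\|W\|} > \lambda \\[4pt] G & \text{otherwise} \end{cases}

Scaling the threshold with the weight norm is what lets the clipping adapt per layer instead of using one global cutoff.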
Setting A Strong Deep Learning Baseline In Minutes With PyTorch
       
Your insurance is based on the number of ants and bees that the camera captures. First, let's install lightning-flash: pip install lightning-flash. Now what we need to do is find a strong pretrained model and finetune it on your data. Here we use a model pretrained on ImageNet and adjust the weights using the "freeze_unfreeze" scheme. In just a few lines you've managed to get a baseline done for your work.
8 Node.js Projects To Keep An Eye On
       
Cytoscape.js: an open-source JavaScript library used for visualization and graph analysis that comes with a rich, interactive implementation. Cytoscape can be used on Node.js to perform graph analysis on a web server or in the terminal. Strapi: an open-source content management system; a backend-only system that provides functions to use with RESTful APIs, with the main aim of getting and delivering content across all devices in a structured way. SheetJS: a Node.js library that allows you to manipulate Excel spreadsheets and do a lot of other things with Excel, for example exporting workbooks from scratch and converting HTML tables or JSON arrays into downloadable xlsx files. Express.js: one of the most popular Node.js open-source projects, which offers immense value because of its efficient HTTP request handling and because it brings JavaScript, a server-side language here, out of the browser.
The GOP Didn’t Just Acquit Trump, It Endorsed Him
       
The GOP Didn’t Just Acquit Trump, It Endorsed HimSen. Josh Hawley gives Sen. Ted Cruz a thumbs up before the conclusion of former President Donald Trump’s impeachment trial. Trump didn’t mean the election rules were unfair because the Electoral College system undermines the ideal of one person, one vote. Biden drew more than 7 million more votes than Trump and won 306 Electoral College votes to Trump’s 232. The Trump team played footage of elected Democrats and liberal celebrities making jokes Trump’s lawyers suggested were akin to promoting political violence. During this week’s hearings, though, that’s exactly what Trump’s lawyers did.
Kirsten Watson joins Dodger broadcast team as reporter and host
       
The Dodgers announced the addition of Kirsten Watson to their broadcast team as a reporter and host. Watson will contribute to the Dodgers’ SportsNet LA game broadcasts, studio programming and pregame and postgame coverage on television and radio. “I am excited to continue my passion as a storyteller with the Los Angeles Dodgers,” Watson said. Watson hosted Monday Night Football on Channel 5 in the UK, an international broadcast in partnership with NFL Network. Watson also earned her bachelor’s degree from Columbia and was a member of the school’s volleyball team.
House of Cards: The crash of One-Two-GO flight 269
       
Officials work in front of the imposing wreckage of One-Two-GO flight 269 in Phuket, Thailand. (Google) It was on the 16th of September 2007 that 123 passengers and 7 crew boarded One-Two-GO flight 269 for a regular flight from Bangkok to Phuket. In 2010 the One-Two-GO brand was subsumed into Orient Thai, only for Orient Thai to have its Chinese operations suspended by the Civil Aviation Administration of China due to violations. Orient Thai was suspended again by Thailand in 2017, then briefly resumed service before ceasing operations permanently the following year. (Stuff.co.nz) The crash of One-Two-GO flight 269 could be described as a wake-up call for Thailand's aviation authorities.
Debug your tests in TypeScript with Visual Studio Code
       
Show in Test Explorer will jump to the entry in the Test Explorer. If you navigate through the Test Explorer, you will see that you can run tests individually or all at once, at any level in the hierarchy. The Test Explorer can be a great help to debug individual tests (if several fail) one after the other. All tests run without errors. Conclusion: this was intentionally a very simplified example of how test-based debugging can work. Additionally, I wanted to point out the very useful Test Explorer UI extension by Holger Benl for Visual Studio Code.
Using container images to run PyTorch models in AWS Lambda
       
This is where AWS Lambda can be a compelling compute service for scalable, cost-effective, and reliable synchronous and asynchronous ML inferencing. You can package your code and dependencies as a container image using tools such as the Docker CLI. You can then create the Lambda function from the container image stored in Amazon ECR. For a more detailed description about AWS SAM and container images for Lambda, see Using container image support for AWS Lambda with AWS SAM. You can bring your custom models and deploy them on Lambda using up to 10 GB for the container image size.
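Inside the container, the Lambda entry point is an ordinary handler function; a hedged sketch of what PyTorch inference there might look like (the model path, input format, and TorchScript loading are placeholder assumptions, not the article's exact code):

import json
import torch

# Load once at import time so warm invocations reuse the model.
model = torch.jit.load("/opt/ml/model.pt")   # hypothetical model location
model.eval()

def handler(event, context):
    x = torch.tensor(json.loads(event["body"])["input"])
    with torch.no_grad():
        y = model(x)
    return {"statusCode": 200, "body": json.dumps(y.tolist())}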
How to Get the Most of the ML Ensembles
       
Stacking has the same working principle as the voting ensemble. However, stacking can adjust the submodels' predictions sequentially, feeding them as inputs to a meta-model, to boost performance. In other words, stacking generates predictions from each model's algorithm; subsequently, the meta-model uses these predictions as inputs (weights) to create the final outputs. The strength of stacking is that it can combine different powerful learners and make more precise and robust predictions than any standalone model. You can either implement it from scratch yourself or use the ML-Ensemble library.
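scikit-learn also ships this directly; a minimal sketch of a stacked ensemble (base learners and meta-model are arbitrary choices):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier()), ("svc", SVC())],
    final_estimator=LogisticRegression(),   # meta-model over base predictions
    cv=5,                                   # out-of-fold predictions avoid leakage
)
print(stack.fit(X, y).score(X, y))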
Beautiful decision tree visualizations with dtreeviz
       
In this article, I will first show the "old way" of plotting decision trees and then introduce the improved approach using dtreeviz. The "old way": the next step involves creating the training/test sets and fitting the decision tree classifier to the Iris data set. Now that we have a fitted decision tree model, we can proceed to visualize the tree. dtreeviz in action: having seen the old way of plotting decision trees, let's jump right into the dtreeviz approach. Conclusions: in this article, I showed how to use the dtreeviz library for creating pretty and insightful visualizations of decision trees.
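A hedged sketch of the dtreeviz call as it looked around the article's era (newer releases moved to a model-adaptor interface, so treat the exact signature as an assumption):

from dtreeviz.trees import dtreeviz
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3).fit(iris.data, iris.target)

viz = dtreeviz(clf, iris.data, iris.target,
               target_name="species",
               feature_names=iris.feature_names,
               class_names=list(iris.target_names))
viz.save("iris_tree.svg")   # renders per-split histograms, not just boxes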
5 Data Science Programming Languages Not Including Python or R
       
5 Data Science Programming Languages Not Including Python or R. Photo by Fatos Bytyqi on Unsplash. One of the essential skills you need to master to get into any branch of data science is programming. If you look up data science tutorials and programming languages online, 99% of the results will be in either Python or R. Python and R are great options when learning or building data science applications. Many great resources focus on how Julia can be useful for data science, including the Julia for Data Science book and the Julia for Beginners in Data Science program on Coursera. Python is not the only reasonable option for data science; however, it can be difficult to see other programming languages when one shines so brightly. In this article, I showed you 5 programming languages that you can use for data science, some of which are as cool and flexible as Python.
The AI/ML FDA Plan
       
"You probably have noted by now that this action plan is focusing in the so-called SaMD (Software as a Medical Device)." Traditionally, the FDA reviews medical devices through an appropriate premarket pathway, such as premarket clearance (510(k)), De Novo classification, or premarket approval. Approach for modifications after initial review with established SPS and ACP: besides the predetermined change plan, the SaMD also needs to specify an approach for modifications. Transparency and real-world performance monitoring of AI/ML-based SaMD: finally, the regulatory framework requires AI-based medical devices to address transparency and real-world performance, and to advance real-world performance pilots in coordination with stakeholders and other FDA programs, to provide additional clarity.
Detailed Explanation of Simple Linear Regression, Assessment and, Inference with ANOVA
       
Detailed Explanation of Simple Linear Regression, Assessment, and Inference with ANOVA. A linear relationship between two variables is very common. This article explains the very popular statistical method Simple Linear Regression (SLR). When a linear relation is observed between two quantitative variables, Simple Linear Regression can be used to take explanations and assessments of that data further. The slope and intercept are determined using Simple Linear Regression. Regression equations: the red dotted line in the graph above is called the least-squares regression line.
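For reference, the least-squares line and its fitted coefficients (standard formulas, with \bar{x} and \bar{y} the sample means):

\hat{y} = b_0 + b_1 x, \qquad b_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1 \bar{x}

These are the slope and intercept that minimize the sum of squared residuals between the observed y values and the fitted line.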
A Data Science perspective to Automated Valuation Models
       
Our goal is to present a summary of this guide as we understand it from a data science perspective. Automated Valuation Models (AVMs) aim to automate this process, removing bias and subjectivity from these decisions as much as possible. The European Standards is a guide on the development of property valuation models, and specifically on what to consider when building such systems. The guide examines four main approaches: the House Price Index (HPI), Single Parameter Valuation, Hedonic models, and Comparable-based Automated Valuation Models.
Diving into different GAN architectures
       
They create new data instances that resemble the training data. Applications: the use of GANs has been increasing rapidly in fields such as fashion, art, science, and video games. c. LSGAN (Least Squares Generative Adversarial Network): in a normal GAN, the discriminator uses a cross-entropy loss function, which sometimes leads to vanishing gradient problems. It is known that training GAN networks is highly unstable. In all of these architectures, the generator is the component used to generate the new data.
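For reference, the least-squares objectives LSGAN substitutes for cross-entropy (shown with the common 0/1 target labels; D is the discriminator, G the generator):

\min_D \; \tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[(D(x) - 1)^2\big] + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\big[D(G(z))^2\big], \qquad \min_G \; \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\big[(D(G(z)) - 1)^2\big]

Penalizing squared distance from the target labels keeps gradients alive even for samples the discriminator classifies confidently, which is what mitigates the vanishing-gradient issue.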
Magic 8 Ball: An app to maximize wins in competitive pool matches (Part 1)
       
It is essentially a conversion formula which takes player skill levels as inputs and yields the race length for each player. The Race Margin, Win % Margin, Skill Margin, Game Margin and AvgPPM Margin variables can be used to predict the Win Margin (see below) for a player pairing. Figure 3: Histogram of the Win Margin variable (in units of racks) for the 20,000 matches in the dataset. We only see weak correlations with the target feature, Win Margin, with the highest coefficient being 0.2 for Win % Margin. Choosing a predictive modelWe want to train a model which uses the Race Margin, Game Margin, Win % Margin, Skill Margin, and AvgPPM Margin variables to predict the value of the Win Margin variable.
How-to Fine-Tune a Q&A Transformer
       
The model must then read both and return the token positions of the predicted answer from the context. We can do this like so. Answers: this gives us our two datasets split between three lists (each): a corresponding context, question, and answer set. Our contexts and questions are simple strings, and each corresponds to the other. But rather than human-readable text, the data is stored as BERT-readable token IDs. The tokenizer is great, but it doesn't produce our answer start/end token positions. Each of these is simply a list containing the start/end token positions of the answer that corresponds to its respective question-context pair.
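A hedged sketch of deriving those positions with Hugging Face's char_to_token, assuming SQuAD-style answers with character offsets (contexts, questions, and answers are the lists from the earlier step; the fallback value is one common convention, not the article's necessarily):

from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
encodings = tokenizer(contexts, questions, truncation=True, padding=True)

start_positions, end_positions = [], []
for i, ans in enumerate(answers):
    # Map character offsets in the context to token indices.
    start = encodings.char_to_token(i, ans["answer_start"])
    end = encodings.char_to_token(i, ans["answer_start"] + len(ans["text"]) - 1)
    # Truncated answers map to None; fall back to a sentinel position.
    start_positions.append(start if start is not None else tokenizer.model_max_length)
    end_positions.append(end if end is not None else tokenizer.model_max_length)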
Uncertainty In Depth Estimation
       
So why do we need AI systems to model uncertainty? The objectives of reasoning about uncertainty using a probabilistic approach are to: prevent a model from confidently predicting something it has not observed before (epistemic uncertainty); and ensure that, even when data is observed, inherently ambiguous or noisy data induces a predicted value with an associated uncertainty (aleatoric uncertainty). In this article, we will discuss the use cases of uncertainty and how it applies specifically to depth estimation problems. We will then go through certain types of scenarios in which uncertainty should be reasoned about and give an overview of related research methods for predicting uncertainty. Application of uncertainty: uncertainty estimation plays many essential roles across machine learning systems. Types of behaviour a model should exhibit: to name a few instinctively, we expect our model to be less confident in the predicted depth for the following cases:
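One widely used way to capture aleatoric uncertainty in regression (Kendall and Gal, 2017) is to have the network predict both a depth estimate ŷ and a variance σ², trained with a loss of the form:

\mathcal{L} = \frac{\|y - \hat{y}\|^2}{2\sigma^2} + \frac{1}{2}\log \sigma^2

The first term down-weights errors on inputs the model flags as noisy, while the log term stops it from declaring everything uncertain.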
If You Only Read Two Books in 2021, Read This One Book Twice
       
And here are 3 reasons why this is the first book I recommend to friends and family whenever they ask me for a recommendation. 1. Harari begins his book by pointing out that in our 13.7 billion years of history, human cultures began to take shape only about 70,000 years ago. Since then, three important revolutions have shaped the course of human history. Harari calls them "imagined realities." On cultural limitations, Harari writes, "Culture tends to argue that it forbids only that which is unnatural." Sure, there's a lot more depth to these ideas, and, of course, there are plenty more where they came from.
We Found Aliens And They Live Next Door (Possibly)
       
This signal appears to have come from Proxima Centauri and was a narrow-band beam of microwaves at 982 MHz. They named the signal BLC1, which is imaginative… But we have had signals like this in the past, like the Wow! signal. So, not only does this signal seem to come from Proxima Centauri, it also seems to be coming from its planet, Proxima b! Only once this possibility has been ruled out can we state that aliens live around Proxima Centauri. This is by far one of the most exciting mysteries in science, a glimpse of a cosmic neighbour!
Why Rep. Stacey Plaskett is Making History
       
Why Rep. Stacey Plaskett is making history: U.S. House Representative Stacey Plaskett (D-VI) during day two of the second impeachment trial of Donald J. Trump. Audacious. These are words I attribute to U.S. House Rep. Stacey Plaskett (D-VI). In recognition, they are: lead manager Rep. Jamie Raskin (D-MD), Rep. Joaquin Castro (D-TX), Rep. David Cicilline (D-RI), Rep. Madeleine Dean (D-PA), Rep. Diana DeGette (D-CO), Rep. Ted Lieu (D-CA), Rep. Joe Neguse (D-CO), Rep. Stacey Plaskett (D-VI), and Rep. Eric Swalwell (D-CA). That is, Rep. Stacey Plaskett. First, Rep. Plaskett is making history in her own right!
I’m A Lost Girl of ADHD — This Was My Inner Voice
       
In elementary school, I’m labeled gifted. Already a book lover, I create a volunteer position for myself in my elementary school library. “Your science teacher says you’re always looking around during class. I don’t know why my mind roams like a stray dog during math and science, why segments of school have suddenly become hard. I don’t know about the floods of estrogen and progesterone that course through me, creating havoc.
Marin County May Be the Fakest ‘Woke’ Place in America
       
It was a proud moment for Little Fawn Boland, an attorney who identifies as Native American, when she bought a home in Mill Valley, a ritzy city in the heart of Marin County to the north of San Francisco, with then-partner Keith Anderson. "He was basically saying, 'You don't belong here.'" Marin County is a place of contradiction. For Anderson, part of what shapes his everyday experience in Marin is the time he spends planning his response to a racist attack. People of Marin have reelected County Sheriff Robert Doyle for years, she mentioned, despite his cooperation with Immigration and Customs Enforcement in 2019 to deport immigrants. If we can do that, we can make Marin County a better place for everyone to live.
Padres Announce Non-Roster Invites to Major League Spring Training
       
Pitchers and catchers are scheduled to report February 17, with the first full-squad workout on February 22. SAN DIEGO — The San Diego Padres announced today they have extended Major League Spring Training invitations to 33 players. Right-handers Pedro Avila, Miguel Diaz and Chase Johnson, left-handers Daniel Camarena and Brady Feigl, infielder Ivan Castillo and catcher Webster Rivas also return to the Padres organization in 2021 on minor league contracts. The list of non-roster invitees includes 15 right-handed pitchers, eight left-handers, two catchers, six infielders and two outfielders. Of the invitees, 10 have Major League experience. Below is a complete list of non-roster players who will be in Padres camp:
How to Develop a Neural Net for Predicting Disturbances in the Ionosphere
       
n_features = X.shape[1]
# define model
model = Sequential()
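A plausible completion of this model definition for the ionosphere task (34 radar-signal features, a binary good/bad label) is sketched below; the layer sizes are illustrative rather than taken from the tutorial, and X_train/y_train are assumed to be prepared splits:

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

n_features = X.shape[1]  # 34 features in the ionosphere dataset

# define model
model = Sequential()
model.add(Dense(10, activation="relu", input_shape=(n_features,)))
model.add(Dense(8, activation="relu"))
model.add(Dense(1, activation="sigmoid"))  # binary output: good vs. bad radar return

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=0)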
Google AI Blog: Uncovering Unknown Unknowns in Machine Learning
       
The performance of machine learning (ML) models depends both on the learning algorithms and on the data used for training and evaluation. CATS4ML relies on people's abilities and intuition to spot new data examples about which a machine learning model is confident but which it actually misclassifies. There are two categories of weak spots: known unknowns and unknown unknowns. First edition of the CATS4ML Data Challenge: the challenge focuses on visual recognition, using images and labels from the Open Images Dataset. Participants score on submitted image-label pairs, which means that one and the same image can be an example of an ML unknown unknown for different labels.
Google AI Blog: 3D Scene Understanding with TensorFlow 3D
       
In order to further improve 3D scene understanding and reduce barriers to entry for interested researchers, we are releasing TensorFlow 3D (TF 3D), a highly modular and efficient library that is designed to bring 3D deep learning capabilities into TensorFlow. TF 3D contains training and evaluation pipelines for state-of-the-art 3D semantic segmentation, 3D object detection and 3D instance segmentation, with support for distributed training. In addition, it offers a unified dataset specification and configuration for training and evaluation of the standard 3D scene understanding datasets. Furthermore, we will go over each of the three pipelines that TF 3D currently supports: 3D semantic segmentation, 3D object detection and 3D instance segmentation. 3D Object DetectionThe 3D object detection model predicts per-voxel size, center, and rotation matrices and the object semantic scores.
Expanding automatic machine translation to more languages
       
Providing automatic translation at our scale and volume requires the use of artificial intelligence (AI) and, more specifically, neural machine translation (NMT). Low-resource translation experiments: most translation systems require parallel data, or paired translations, to be used as training data. The main challenge we faced in building translation systems for new languages was achieving a level of translation quality that yields usable translations in the absence of large quantities of parallel corpora. We used BLEU scores (a metric that measures the degree of overlap between the generated translation and a professional reference) to measure translation quality. In the longer term, it also means expanding our supported directions to cover all languages used on Facebook.
AI gets better every day. Here’s what that means for stopping hate speech
       
Throughout 2020, our engineers worked to improve the way our AI systems analyze comments, considering both the comments themselves and their context. Our teams brought together better training data, better features, and better AI models to produce a system that is better at analyzing comments and continuously learning from new data. We expect more improvements to come as this field of AI technology continues to advance. The improvement in these foreign languages came about because a whole package of AI technologies made leaps forward in the past year. One particular area of focus is getting AI even better at viewing content in context across languages, cultures, and geographies.
Glow: A community-driven approach to AI infrastructure
       
This approach allows partners to more rapidly design and optimize new silicon products for AI and ML by leveraging community-driven compiler software. It accepts a computation graph from these frameworks and generates highly optimized code for machine learning accelerators. Hardware partners that use Glow can reduce the time it takes to bring their product to market. Looking ahead: we'd like to acknowledge the tremendous contributions and support of the LLVM community in advancing ecosystem support and adoption for Glow. We're excited to work with our hardware ecosystem partners to unlock the next steps in AI/ML innovation through Glow.
ONNX expansion speeds AI development
       
Making AI tools interoperable: major technology companies have fueled AI development by open-sourcing or actively backing various deep learning frameworks. We began a collaboration with Microsoft in September 2017 to launch the ONNX specification with the purpose of making the deep learning ecosystem interoperable. Baidu's PaddlePaddle (PArallel Distributed Deep LEarning) is a deep learning platform originally developed by Baidu scientists and engineers to use on the company's own products. The native ONNX parser in TensorRT 4 provides an easy path to import ONNX models from frameworks such as Caffe2, Chainer, Microsoft Cognitive Toolkit, Apache MXNet and PyTorch into TensorRT. We are confident ONNX will continue to grow and find new uses to drive AI development and implementation.
Terribly Perceptive: The perceptron & the XOR
       
Mathematically, the perceptron is not much more than some relatively simple linear algebra and vector calculus: a simple perceptron with two inputs, x1 and x2, and a single output unit, z, with activation y. But this is too simple for us, and when else will we find a use for a simple perceptron? Look at figure 2 for a toy example: we have a 2-dimensional input vector and 2 associated weights, w_1 and w_2. Gradient descent works by taking the derivative of the error function with respect to the weights.
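As a rough illustration (a minimal sketch, not the article's code), a single-layer perceptron trained by gradient descent on the squared error never solves XOR, because XOR is not linearly separable:

import numpy as np

# XOR truth table: no single line can separate the two classes
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 1, 1, 0])

rng = np.random.default_rng(0)
w = rng.normal(size=2)
b = 0.0
lr = 0.1

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

for _ in range(10_000):
    y = sigmoid(X @ w + b)              # forward pass
    err = y - t                          # derivative of squared error w.r.t. y (up to a factor)
    grad_w = X.T @ (err * y * (1 - y))   # chain rule through the sigmoid
    grad_b = np.sum(err * y * (1 - y))
    w -= lr * grad_w
    b -= lr * grad_b

print(np.round(sigmoid(X @ w + b), 2))   # stays near 0.5 on XOR: the perceptron cannot solve it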
Speed up your data science learning process
       
I was not a visual learner, which made it difficult for me to watch and mimic what other people did. Based on your learning style, I will also explain the different approaches you can take to study data science. As a visual learner, watching online courses would greatly benefit your data science learning process. Reading/writing learnersPhoto by Aaron Burden on UnsplashAs the name suggests, reading and writing learners learn best from reading and writing. The traditional school system caters to reading and writing learners, since most of the work done involves writing essays and learning from textbooks.
Heated Discussions: Predicting Conflict Intensity Using Climate Data
       
tl;dr: Climate change is leading to increased political tensions and, some researchers speculate, is therefore driving increased armed conflict across the world. The project concludes that it is not possible to accurately predict conflict intensity using local climate data. I selected India as a case study because it had the necessary overlap of both conflict and climate data: 15,000+ conflict incidents and 3,800+ weather stations. Patterns in other countries may be stronger, and including data other than just climate data will definitely improve performance. This would mean changing our unit of observation from 'conflict incidents' to 'countries' or some other unit area.
The Trivial Transformer
       
I'd like to start this article with a short disclaimer: I may have chosen a terrible name for this article. Luckily, the Transformer model is here to save the day! Transformer concept #1b, scaled dot-product attention and multi-headed attention: the type of attention used in Transformers is unique because it computes attention on itself (self-attention). The whole story (architecture from the original paper, Attention Is All You Need): once upon a time, someone built a tiny multi-million-parameter model named a "Transformer". It has been used widely in top-performing modern models such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pretrained Transformer).
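Scaled dot-product self-attention as described here can be sketched in a few lines of NumPy (shapes and names are illustrative, not from the article):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # scores: how much each query token attends to each key token
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # scale to keep softmax gradients healthy
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of the values

# self-attention: queries, keys, and values all come from the same sequence
x = np.random.randn(5, 16)                           # 5 tokens, 16-dim embeddings
Wq, Wk, Wv = (np.random.randn(16, 16) for _ in range(3))
out = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)                                     # (5, 16)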
5 Must-Know Data Visualizations for Better Data Analysis
       
Data visualization is an integral part of data science. The relationships among variables, the distribution of variables, and the underlying structure in data can easily be discovered using data visualization techniques. In this article, we will go over 5 fundamental data visualization types that are commonly used in data analysis. We will be using the Altair library, a statistical visualization library for Python. I previously wrote similar articles with Seaborn and ggplot2 if you prefer one of those libraries for data visualization tasks.
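As a flavor of the Altair API, here is a minimal scatter plot (the dataframe and column names are invented for illustration):

import altair as alt
import numpy as np
import pandas as pd

# toy dataset: two numeric columns and one categorical column
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "price": rng.normal(100, 15, 200),
    "area": rng.normal(120, 30, 200),
    "city": rng.choice(["A", "B"], 200),
})

# scatter plot revealing the relationship between two variables
chart = alt.Chart(df).mark_circle(size=50).encode(
    x="area", y="price", color="city",
    tooltip=["area", "price"],
).interactive()

chart.save("scatter.html")  # or display inline in a notebook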
Music Generation with ConvLSTMs
       
This project is a continuation of my other project, because I came up with an idea to generate even better results. With this music at the back of your mind, let's start constructing the program. Data preprocessing: the first step of any machine learning project is data preprocessing. Note that no validation data is needed, as the model doesn't actually need to predict the next note with 100% accuracy; it is simply supposed to learn a pattern from the source music, so as to create a plausible next note. Hearing results:

song_length = 106
data = X[np.random.randint(len(X))][0]
song = []
for i in range(song_length):
    pred = model.predict(data.reshape(1, 1, 9, 106))[0]
    notes = pred.astype(np.uint8).reshape(106)
    print(notes)
    song.append(notes)
    data = list(data)
    data.append(notes)
    data.pop(0)
    data = np.array(data)

This loop lets the model generate its own music, conditioned on what it has already generated.
You Should Master Python First Before Becoming a Data Scientist
       
That being said, I am going to highlight a few reasons why you should learn Python first before learning Data Science. SummaryAs you can see, mastering Python is a crucial step in also learning Data Science. To become a great Data Scientist, there are several key concepts and skills you should acquire beforehand, like statistics and data analytics as well. For this article, I have discussed some of the key reasons why you would want to master Python before learning Data Science. Please feel free to comment down below if you have learned Python first in some way before becoming a Data Scientist.
5 Ways to develop a Sentiment Analyser in Machine Learning
       
Custom trained supervised model: you can train a custom machine learning or deep learning sentiment analysis model. To train a custom sentiment analysis model, one must first collect a raw labeled dataset for sentiment analysis. A positive sentiment score ranges between 0 and 1 and depicts positive sentiment, with 1 being a positive prediction at 100% confidence; a negative sentiment score ranges between -1 and 0, with -1 being a negative prediction at 100% confidence. Named-entity-based sentiment analyzer: this kind of analyzer is mainly targeted at entities and important words.
Audio Deep Learning Made Simple (Part 1): State-of-the-Art Techniques
       
Over the next few articles, I aim to explore the fascinating world of audio deep learning: what problems is audio deep learning solving in our daily lives? Preparing audio data for a deep learning model: until a few years ago, in the days before deep learning, machine learning applications in computer vision relied on traditional image processing techniques for feature engineering. Audio deep learning models: now that we understand what a spectrogram is, we realize that it is an equivalent, compact representation of an audio signal, somewhat like a 'fingerprint' of the signal. So most deep learning audio applications use spectrograms to represent audio in their typical pipelines.
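Computing such a spectrogram takes only a few lines with librosa (a sketch; the file path is a placeholder, and this is not the article's code):

import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

# load an audio file (placeholder path) and compute a mel spectrogram
y, sr = librosa.load("clip.wav", sr=None)
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)  # convert power to decibels

# the resulting 2-D array is the image-like 'fingerprint' fed to a CNN
librosa.display.specshow(mel_db, sr=sr, x_axis="time", y_axis="mel")
plt.colorbar(format="%+2.0f dB")
plt.savefig("mel_spectrogram.png")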
Deploying time series forecasting models at scale (Part I)
       
Deploying machine learning models remains a sticking point for many companies, and time series forecasting models are no exception; this is especially true of deep learning models, which often have many moving parts. In this article we will discuss general techniques for taking time series forecasting models to production. We will then dive more specifically into how to get PyTorch models trained using Flow Forecast ready for deployment. Choosing your overall forecast length: a big bottleneck remains forecasting long time series, as the model can only predict values as far ahead as its forecast length. In Part II of this article series we will deploy example models with FastAPI, Docker, Terraform and Kubernetes. Current data ingestion and future versions: at the moment, InferenceMode only supports loading inference data from a CSV.
Federated Learning through Distance-Based Clustering
       
Federated learning: machine learning algorithms, especially deep learning algorithms, need a large dataset to train effectively. A solution was proposed by combining federation with machine learning, and it was called federated learning (FL). In the first step, a machine learning model is chosen on the centralized server. Dynamic distance-based clustering: the second method for clustering is dynamic distance-based clustering. Using the individual device weights from Phase 2, we cluster them with K-means and distance-based clustering.
The Five Most Likely Problems You’ll Face As A Startup CEO
       
I thought my words exposed my inner feelings, but Irwin and Winston just moved on with the conversation. I kept asking myself, "Would it ever get any easier?" Every stage of your startup journey is going to have its share of challenges. As I walked back to my car, I thought about everything I had been through to get to this point. The list was really long, but I'll point out five things here. Problem number one: you will likely be your toughest critic. "Most startups are behind plan, and we're no different." Problem number two: you'll likely have to adjust your sales and marketing plans.
3 Key Tasks I Always Do Between 5 and 8 AM
       
When I graduated from college just over a year ago, I felt extremely stressed and overwhelmed by my future. I graduated with a degree in electrical engineering, but I had a deep desire to pursue self-employment and build up my own business and brand. Each morning, I put in a few hours of crucial work. The following three work tasks are the ones that ultimately helped me through that chaotic first six months, and they continue to keep me organized and productive today. In sharing them, I hope you can gain inspiration for new ways to improve your own work efficiency.
Scientists May Have Detected a Signal That Could Change Astronomy Forever
       
Scientists think they may have spied the universe's "gravitational wave background" after more than a decade of searching. By Becky Ferreira. In 2015, scientists snagged the first detection of a gravitational wave, a ripple in the fabric of spacetime. Detectors like the Laser Interferometer Gravitational-Wave Observatory (LIGO), which captured the first gravitational wave over five years ago, are built to sense relatively loud, high-frequency waves. In this way, the project is essentially utilizing a natural, galaxy-sized observatory to hunt for tiny light fluctuations that might expose the gravitational wave background. These low-frequency waves may finally reveal whether, and how, supermassive black holes merge when their host galaxies collide.
12 Mistakes Newbie Web Developers Make
       
Test your web application on multiple browsers. Be careful when adding many npm packages to your web application. As a web developer, you need to think about SEO from the moment you start building your web application, not at the end. For example, when writing HTML, you need to use semantic elements to structure your web page according to web standards. Start using a CSS preprocessor such as Sass/Stylus/Less if you haven't already: there are many advantages to using CSS preprocessors or styled-components over plain CSS in web applications.
Flutter Is About To Win Over the Web
       
The dawn of a new era: on the 6th of August 1991, the web went live to the world. This left us with the web as we have it today, and what does that include? The document: when the web first came into existence, people were not using apps as they are today. You start using things like position in CSS to control where your elements are laid out. That's HTML, JavaScript, and CSS.
Clubhouse Is Suggesting Users Invite Their Drug Dealers and Therapists
       
When you join the fast-growing, invite-only social media app Clubhouse — lucky you! — you are prompted to share your contacts. You don't have to do it, but if you don't, you lose the ability to invite anyone else to Clubhouse. There are at least two additional ways in which Clubhouse appears to take users' contact data further than the norm. And so instead, the names you see near the top of your invite list likely belong to entities that people have intentionally avoided inviting, despite their high connectedness. Clubhouse may or may not be the worst offender in terms of how it handles contacts that users upload.
Royals Announce Game Times for 2021 Regular Season Schedule
       
In conjunction with Major League Baseball, the Kansas City Royals announced their 2021 regular season schedule with times today. Opening Day is scheduled for Thursday, April 1 at 3:10 p.m. vs. the Texas Rangers at Kauffman Stadium. The Royals will begin all 81 games at Kauffman Stadium at 10 minutes past the hour, including all weeknight games at 7:10 p.m. Eight of the nine weekday afternoon games at Kauffman Stadium will begin at 1:10 p.m. Twelve of the Royals’ 13 Sunday home games are scheduled to begin at 1:10 p.m.
Dodgers officially sign 2020 NL Cy Young Award winner Trevor Bauer
       
By Rowan Kavner. The reigning National League Cy Young Award winner is officially a Dodger. The dominant season saw him finish sixth in Cy Young Award voting. Beyond possessing an array of high-quality pitches in his arsenal, Bauer also had the best fastball spin rate in baseball during his Cy Young Award-winning 2020 season, according to Statcast. Bauer became the first UCLA Bruin to win the Cy Young Award and was the first pitcher in Reds franchise history to win the award. "I heard a lot about what the Dodgers do from the outside looking in," Bauer said.
Running multiple HPO jobs in parallel on Amazon SageMaker
       
Because you can run 20 trials and max_parallel_jobs is 10, you can maximize the number of simultaneous trials by running 20/10 = 2 HPO jobs in parallel. This approach allows you to run HPO jobs in parallel up to the allowed limit of 100 concurrent HPO jobs. Queuing HPO jobs with Amazon SQS: when multiple data scientists create HPO jobs in the same account at the same time, the limit of 100 concurrent HPO jobs per account might be reached. If the 100 concurrent HPO jobs limit isn't reached, the worker retrieves messages from the SQS queue and creates HPO jobs as stipulated in the message. To learn more about bringing other algorithms such as genetic algorithms to SageMaker HPO, see Bring your own hyperparameter optimization algorithm on Amazon SageMaker.
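For reference, a single HPO job in the SageMaker Python SDK looks roughly like this (a sketch with a placeholder estimator and made-up hyperparameter ranges, not the post's exact code); calling fit with wait=False is what lets a script launch several jobs in parallel:

from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

# `estimator` is assumed to be a previously configured sagemaker.estimator.Estimator
tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:auc",
    hyperparameter_ranges={
        "eta": ContinuousParameter(0.01, 0.3),
        "max_depth": IntegerParameter(3, 10),
    },
    max_jobs=20,           # total trials for this HPO job
    max_parallel_jobs=10,  # trials running at once within the job
)

# wait=False returns immediately, so a loop can start several HPO jobs in parallel
tuner.fit({"train": train_s3_uri, "validation": val_s3_uri}, wait=False)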
Building secure machine learning environments with Amazon SageMaker
       
Third, because building secure ML environments in the cloud is a relatively new topic, understanding recommended practices is also helpful. You can access these workshops on Building Secure Environments, and you can find the associated code on GitHub. Although the workshop is built using SageMaker notebook instances, in this post we highlight how you can adapt this to Amazon SageMaker Studio. Labs 1–3 in the Building Secure Environments and Labs 1–2 in the Using Secure Environments workshops focus on how you can enforce IT policies on your ML environments. For more information, see Amazon SageMaker Experiments – Organize, Track And Compare Your Machine Learning Trainings.
Central Limit Theorem — Clearly Explained
       
(Statistics are the values from which we infer the population's parameters.) Statistic: sample standard deviation S, sample mean X̄. Parameter: population standard deviation σ, population mean μ. We draw inferences from statistic to parameter. Standard error of the mean = S / sqrt(n), where n is the sample size; the standard error decreases as the sample size increases. The mean of the sampling distribution of the mean is equal to the population mean. For most distributions, n > 30 gives a sampling distribution that is nearly normal. These sampling distribution properties also hold for the central limit theorem. Confidence interval for the population mean = sample mean ± z * (standard error of the mean), where z is the z-score associated with the chosen confidence level.
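The theorem is easy to check numerically; a small NumPy simulation (an illustration, not from the article):

import numpy as np

rng = np.random.default_rng(42)

# a decidedly non-normal population: exponential with mean 2
population = rng.exponential(scale=2.0, size=1_000_000)

n = 40  # sample size (> 30)
means = np.array([rng.choice(population, size=n).mean() for _ in range(5_000)])

print(population.mean())              # population mean mu, ~2.0
print(means.mean())                   # mean of the sample means, ~mu
print(population.std() / np.sqrt(n))  # predicted standard error sigma/sqrt(n)
print(means.std())                    # observed spread of sample means, ~sigma/sqrt(n)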
K Means Clustering using PySpark on Big Data
       
If you are not familiar with K Means clustering, I recommend going through the article below: K Means clustering on Big Data. When building any clustering algorithm using PySpark, one needs to perform a few data transformations. Now that our data is standardized, we can develop the K Means algorithm. PySpark uses the concept of data parallelism or result parallelism when performing K Means clustering.
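Those transformations typically look like the following in the PySpark ML API (a sketch; the file path and column names are made up):

from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler, StandardScaler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("kmeans-demo").getOrCreate()
df = spark.read.csv("data.csv", header=True, inferSchema=True)  # placeholder path

# 1) assemble the numeric columns into a single vector column
assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features_raw")
df = assembler.transform(df)

# 2) standardize so no feature dominates the distance metric
scaler = StandardScaler(inputCol="features_raw", outputCol="features",
                        withStd=True, withMean=True)
df = scaler.fit(df).transform(df)

# 3) fit K Means on the standardized features
kmeans = KMeans(featuresCol="features", k=4, seed=1)
model = kmeans.fit(df)
clustered = model.transform(df)  # adds a 'prediction' column with cluster ids

clustered.groupBy("prediction").count().show()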
Microkernel Architecture for Machine Learning Library
       
What is Microkernel Architecture: the Microkernel Architecture is sometimes referred to as Plug-in Architecture. It consists of a Core System and Plug-in components. The Core System contains the minimal functionality required to make the system operational. The Core System also maintains a plug-in registry, which defines information and contracts between the Core System and each Plug-in component, such as input/output signature and format, communication protocol, etc. The pros and cons of the Microkernel Architecture are quite obvious.
Serving ML Models in Production with FastAPI and Celery
       
celery_task_app\tasks.py contains the Celery task definitions, specifically the prediction task in our case. celery_task_app\ml\model.py contains the machine learning model wrapper class used to load the pretrained model and serve predictions. Most simple tasks can be defined using the task decorator, which overrides the run method of Celery's base task class. By extending the Celery Task object, we can override the default behavior so that the ML model is loaded only once, when the task is first called. …/churn/result/ (GET): checks if the task result is available in the backend, and returns the prediction if it is.
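That load-once pattern can be sketched as follows (a minimal sketch; the broker URL, model loader, and task name are placeholders, not the article's code):

from celery import Celery, Task

app = Celery("tasks", broker="redis://localhost:6379/0",
             backend="redis://localhost:6379/1")

class PredictTask(Task):
    """Base task that lazily loads the ML model once per worker process."""
    abstract = True

    def __init__(self):
        self._model = None

    @property
    def model(self):
        if self._model is None:
            self._model = load_pretrained_model()  # placeholder loader
        return self._model

@app.task(base=PredictTask, bind=True)
def predict_churn(self, features):
    # the model is loaded on the first call, then reused for every task after
    return self.model.predict(features)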
How I ranked in the top 25% on my first Kaggle competition
       
I want to share these learnings with other early Kagglers so that you can improve your performance on Kaggle competitions. Four suggestions to perform well on Kaggle competitions: understand the dataset; start with a simple model; learn from everywhere. The Data tab in the Kaggle competition space. Now, we will transform the column cont5 using the Box-Cox method and use it as a feature for the LightGBM model. While this approach did not improve our model's performance, we did learn something new and valuable.
Genomics New Clothes
       
In the figure above, you can see that this ratio for a genetic variation data matrix (left) is very different from that of other common data types. For example, if N ~ P or N > P, we can use traditional frequentist statistics; the closer P is to N (still provided that N > P), the more biased the variance estimator becomes. Sparsity of genomics data: one remarkable manifestation of the curse of dimensionality in genomics is the sparsity of genomics data. Another manifestation of the poor grouping of data points in high dimensions is the very weak correlation between the data points.
Clustering Villages and Finding Fresh Produce Suppliers in Metro Manila Using K-Means, Foursquare, and Folium
       
I only searched then for the best wet markets to supply villages in Clusters 1, 2, 3, and 5. I got the latitude and longitude of Ayala Alabang and made a search query for “wet markets” near Ayala Alabang which were to be accessed through Foursquare API. I repeated the same process for choosing the wet market suppliers for Clusters 2, 3, and 5. This is so that the wet market supplier could be near an “entry point” of the delivery route for that group. Therefore, I would recommend Poblacion Public Market as the wet market supplier for Cluster 5.
Data representation in NLP
       
This section will present classic NLP methods such as N-grams (Broder et al.). Example sentence: «I am an AI researcher trying to explain NLP concepts of data representation». Unigrams: (I), (am), (an), (AI), (researcher), (trying), (to), (explain), (NLP), (concepts), (of), (data), (representation). Bigrams (2-grams): (I, am), (am, an), (an, AI), (AI, researcher), (researcher, trying), (trying, to), (to, explain), (explain, NLP), (NLP, concepts), (concepts, of), (of, data), (data, representation). Trigrams (3-grams): (I, am, an), (am, an, AI), (an, AI, researcher), (AI, researcher, trying), (researcher, trying, to), (trying, to, explain), (to, explain, NLP), (explain, NLP, concepts), (NLP, concepts, of), (concepts, of, data), (of, data, representation). Each of these n-grams is assigned a probability of existence P(w|h), where w is the next word and h is the history; the probability is estimated from how often the sequence occurs in the corpus. Fig. 1: representation of the transition from labeled data to a one-hot encoding representation (Chollet 2017). The feature vector, or word vector, represents different aspects of the word within a vector space. Conclusion: through this article, we reviewed different approaches for data representation in the NLP world.
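Producing these n-grams programmatically is a one-liner; a small illustrative sketch:

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "I am an AI researcher trying to explain NLP concepts of data representation"
tokens = sentence.split()

print(ngrams(tokens, 1))  # unigrams
print(ngrams(tokens, 2))  # bigrams
print(ngrams(tokens, 3))  # trigrams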
CNN cheatsheet — the essential summary (Part 2)
       
The goal is to train a generator, a CNN for image tasks that would generate images replicating those in a pre-specified distribution. Allowing for varied-size images: using the same CNN on images of varied size is desirable. Taking spatial relations into account: a typical CNN is capable of extracting the relationship between parts of the image far apart. Here, x is the input signal (image, sequence, video; often their features) and y is the output signal of the same size as x. In Network In Network, the convolution is replaced with a 'micro network', a nonlinear multilayer perceptron.
Music Genre Recognition Using Convolutional Neural Networks- Part 2
       
Now in this part we will develop a web app using the awesome Streamlit library. I'm just going to build a basic web app, and you can further improve it by adding your own style and creativity.

def convert_mp3_to_wav(music_file):
    sound = AudioSegment.from_mp3(music_file)
    sound.export("music_file.wav", format="wav")

If you remember, we trained our model to recognise genre using audio clips of 3 seconds. You have just deployed an app using an Amazon EC2 instance. Conclusion: this was the last part of Music Genre Recognition using Convolutional Neural Networks.
How To Find Stocks That Go Up 1,000% Before Everyone Else
       
The higher a stock’s market cap, the more difficult it is to further achieve “to the moon” type returns. Before we can find stocks that go up 1,000% in the future, we have to get better at finding under the radar companies. There are plenty of under the radar stocks that will skyrocket in a few years, but there are also plenty of under the radar stocks that will plummet or stay flat for many years. You don’t want to compare e-commerce stocks with financial stocks because the e-commerce stocks would then always look overvalued. This is why you should invest in multiple under the radar stocks instead of just loading up on a single under the radar stock.
Why among the People You Date the Most Attractive Ones Tend to Be Jerks
       
In other words, we’ll assume that of all the people available to you in the dating pool, we only care about two things: how much of a jerk they are and how physically attractive they are. This means that there’s a whole group of people you are simply unwilling to date. How exactly your line divides the dating pool depends on how tolerant you are of mean people and how much you care about their looks. Now, if you’re like most people, you’re probably gawking at the gorgeous sweeties in the upper left corner. It’s just that your own romantic filter eliminates ugly jerks and your competition filters out the people who are gorgeous and nice.
On Anti-Asian Hate Crimes: Who Is Our Real Enemy?
       
Between March and August of 2020, Stop AAPI Hate received over 2,583 reports of anti-Asian hate crimes nationwide, and these incidents go grossly underreported. In light of these complexities, here is how everyone can help: acknowledge, amplify, and denounce the ongoing anti-Asian hate crimes. We must be principled in our anger and channel it toward dismantling the real enemy: white supremacy culture, which creates the either/or binary and scarcity mindset that has left us fighting each other for the scraps. "Do you know them?" Interrupt the active and persistent erasure of Black and Asian solidarity work. I am seeing many Bay Area based, Black-led organizations publicly denouncing anti-Asian hate crimes while supporting community-based solutions that will keep all local communities safer.
Expert Says Everyone Should Upgrade to a Medical-Grade Mask
       
More transmissible coronavirus variants are spreading, which means it's time to up your mask game. An expert at the Harvard T.H. Chan School of Public Health and adviser on President Joe Biden's Covid-19 advisory board said he hoped to see more people use medical-grade masks. "Number one, I would recommend that we move to medical-grade masks," he said. N95 masks, the "gold standard" of protection, are also considered medical-grade masks. Last month, Robert Roy Britt wrote a comprehensive piece in Elemental on how to upgrade your mask, covering medical-grade masks and all other types.
How to Use a Cat Filter Like the One in the Viral Zoom Courtroom Video
       
Unsurprisingly, Ponton’s Zoom fail became a meme within hours, and a search for “lawyer Zoom cat” on Twitter already returns thousands of results. I also confirmed that the cat filter is extremely challenging to switch off. Finally, open Zoom (or the video chat program of your choice), start your video, and select Snap Camera as your video source. Now if only we could somehow add a cat filter to the impeachment hearings. Update: This story has been updated to include new information about the Dell software that includes the original cat filter.
How to Use Optimization Algorithms to Manually Fit Regression Models
       
Nevertheless, it is possible to use alternate optimization algorithms to fit a regression model to a training dataset. Tutorial overview: this tutorial is divided into three parts: optimize regression models; optimize a linear regression model; optimize a logistic regression model. Regression models, like linear regression and logistic regression, are well-understood algorithms from the field of statistics. The linear regression model might be the simplest predictive model that learns from data. We can tie all of this together and demonstrate our linear regression model for regression predictive modeling. A logistic regression model is an extension of linear regression for classification predictive modeling.
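To make the idea concrete, here is a rough sketch (not the tutorial's exact code) of fitting linear regression coefficients with stochastic hill climbing on the mean squared error instead of least squares:

import numpy as np

rng = np.random.default_rng(7)

# synthetic regression data: y = 3*x + 2 plus noise
X = rng.uniform(-1, 1, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.1, size=100)

def predict(coef, X):
    return X @ coef[:-1] + coef[-1]  # last element is the intercept

def mse(coef):
    return np.mean((predict(coef, X) - y) ** 2)

# stochastic hill climbing: keep a perturbed candidate only if it lowers the loss
coef = rng.normal(size=2)
best = mse(coef)
for _ in range(5_000):
    candidate = coef + rng.normal(scale=0.05, size=2)
    loss = mse(candidate)
    if loss < best:
        coef, best = candidate, loss

print(coef)  # should approach [3.0, 2.0]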
Accelerating the deployment of PPE detection solution to comply with safety guidelines
       
Finally, it saves the frames in an Amazon Simple Storage Service (Amazon S3) bucket for ML model inference. To test the pipeline, upload a .jpg containing people and faces to the input/[camera_id]/ folder in the S3 bucket. Make sure you have a similar folder structure in the deployed S3 bucket for the code to work correctly:

S3 DataLake Bucket
-- input/
---- [camera_id]/
-- output/
---- ppe/
------ [YYYY]/[MM]/[DD]/
-- error/
---- ppe/

For example, this sample image is uploaded to the S3 bucket location input/CAMERA01/.
Training and deploying models using TensorFlow 2 with the Object Detection API on Amazon SageMaker
       
For example, GluonCV, Detectron2, and the TensorFlow Object Detection API are three popular computer vision frameworks with pre-trained models. In this post, we use Amazon SageMaker to build, train, and deploy an EfficientDet model using the TensorFlow Object Detection API. We install the TensorFlow Object Detection API and the sagemaker-training-toolkit library to make it easily compatible with SageMaker. Training a TensorFlow 2 object detection model using SageMaker: we fine-tune a pre-trained EfficientDet model available in the TensorFlow 2 Object Detection Model Zoo, because it offers good performance on the COCO 2017 dataset and is efficient to run. Summary: in this post, we covered an end-to-end process of collecting and labeling data using Ground Truth, preparing and converting the data to TFRecord format, and training and deploying a custom object detection model using the TensorFlow Object Detection API.
Text pre-processing: Stop words removal using different libraries
       
Some of the libraries used for removing English stop words, along with their stop word lists and code, are given below. Yes, we can also add custom stop words to the list of stop words available in these libraries to serve our purpose. Here is the code to add some custom stop words to NLTK's stop word list:

sw_nltk.extend(['first', 'second', 'third', 'me'])
print(len(sw_nltk))

Output: 183

We can see that the length of the NLTK stop word list is now 183 instead of 179. I am pointing this out because NLTK returns a list of stop words while the other libraries return a set of stop words. We have observed in this article that different libraries come with different collections of stop words, and we can clearly say that stop words are among the most frequently used words in any language.
Hands-On How to Benchmark Your Timeseries Prediction Model
       
If you want to easily determine the performance of the model without thinking about the value range, you can simply rescale the values to the scale you want. What we will try to model is wind speed prediction. Actually, when this article was written there was no standard model or method (except numerical weather prediction) for predicting wind speed. These are chunks of wind speed time series data; in the column names, kecepatan angin is wind speed, arah angin is wind direction, and waktu is time.
A sub-50ms neural search with DistilBERT and Weaviate
       
In this tutorial, we will leverage the ability of the vector search engine Weaviate to run any model in production. Weaviate vector search engine: Weaviate is a real-time vector search engine that is both very fast at query time and suitable for production use. In this tutorial we will cover the following steps: first, we will spin up the Weaviate vector search engine locally using docker-compose; finally, we will use the DistilBERT model one more time to vectorize a search query and then perform a vector search with it. Vectorize a search term with DistilBERT and perform a vector search with Weaviate: we can now reuse the text2vec function we defined above to vectorize our search query.
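The text2vec helper mentioned above can be sketched with the transformers library like this (mean pooling is one common choice; this is an assumption, not necessarily the tutorial's exact approach):

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")

def text2vec(text):
    """Embed a string as the mean of DistilBERT's last hidden states."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze().tolist()    # 768-dim vector

query_vector = text2vec("a sub-50ms neural search")
print(len(query_vector))  # 768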
Reinforcement Learning Lock N’ Roll
       
Lock N' Roll was created in 2009 by Armor Games. I am a Data Scientist trained in the art of Python programming, but I don't have much of a background in reinforcement learning. Under the umbrella of reinforcement learning techniques there is Deep Q Learning (DQL). Double Deep Q Learning (DDQL) becomes necessary when the population of available moves and the variables needing to be considered reach a certain size and complexity. Depending on which move it decided to take, the next available die (sorted alphabetically) was placed on the available space closest to what was chosen.
Celebrating Chinese New Year with CycleGAN
       
Friday, February 12th marks the start of Chinese New Year in 2021. Prior to this work, image-to-image translation required large datasets of paired images, before and after the desired translation is applied. With CycleGAN, you just need a dataset A of summer landscapes and a dataset B of winter landscapes, and the model will learn the image translation — pretty impressive! I illustrated how CycleGAN uses my datasets at a high level to learn image translation in the figure below. Instead, it is very important to visualize results in order to track progress towards your desired image translation.
How To Identify A Clever Data Scientist In 7 Minutes Or Less
       
A clever Data Scientist is an effective communicator. Nonetheless, a clever Data Scientist recognises the importance of cultivating a stable relationship with clients, team members, customers, and managers. Steve Jobs was a great storyteller and his storytelling skills gave plain hardware a soul. Clever Data Scientists do the same with data. Take, for example, Data Scientists on Medium.
A Learning Path To Becoming A Natural Language Processing Expert
       
Like any other data science branch, there are thousands of learning materials for natural language processing online. Math is a vast field, so which aspects of math play a role in natural language processing? It will make a big difference if you understand machine learning basics before you dig deeper into natural language processing. A useful resource for learning machine learning basics is the MIT machine learning course or this course offered by CodeAcademy. Over the past years, advances in technology have allowed various natural language processing techniques to be implemented efficiently and used widely.
Leading a Data Science Project from Scratch
       
If you are new to leading a project in data science, you will have many questions, despite having gone through the same steps one too many times as an intern or an engineer on the team. When it comes to leading a project, you need a bird's-eye view of what makes your project a good product. To give you a blueprint for organizing your upcoming project lead role, here are some things common to every data science project built from scratch. After cleaning your dataset, the usable data might be much slimmer than the original data set. Then, validate your model with more data, or data you freshly collected after your data collection process ended.
Are your AI jobs reproducible?
       
Are your AI jobs reproducible? Many factors impact training performance, from all over the infrastructure, the software, and the runtime tunables. Examining a matrix of previous training runs' settings helps find the path to faster "time to accuracy." Only by comparing apples to apples can teams get a reasonable idea of the performance impact that changing a single component will have. Summary: if you test your DL training performance, you should think of it as a snapshot in time.
How deep learning can solve problems in high energy physics
       
An interesting branch of physics that is still being researched and developed today is the study of subatomic particles. Scientists at particle physics laboratories around the world use particle accelerators to smash particles together at high speeds in the search for new particles. In this article, I will demonstrate how you can use the HEPMASS Dataset to train a deep learning model that can distinguish particle-producing collisions from background processes. This field is often referred to as high energy physics because the search for new particles involves using particle accelerators to collide particles at high energy levels and analyzing the byproducts of these collisions. Please refer to this GitHub repository to find the full code used in this article.
Keeping Up with PyTorch Lightning and Hydra — 2nd Edition
       
Short note on the 2nd edition: back in August 2020, I wrote a story about how I used PyTorch Lightning 0.9.0 and Hydra's fourth release candidate for 1.0.0 to shrink my training script by 50%. So I decided to write this 2nd edition of my original post to "keep up" with PyTorch Lightning and Hydra. PyTorch Lightning 1.1: after months of hard work, the PyTorch Lightning team released 1.0 in October 2020. If you want to learn more about PyTorch Lightning in general, check out the GitHub page as well as the official documentation. Conclusion: since I wrote my first blog post about Leela Zero PyTorch, both Hydra and PyTorch Lightning have introduced a number of new features and abstractions that can greatly simplify your PyTorch scripts.
Deformable Convolution and Its Applications in Video Learning
       
The convolution layer is the basic layer in convolutional neural networks. Although it is widely used in computer vision and deep learning, it has several shortcomings. Moreover, since the receptive field of an output pixel is always a rectangle, as a cumulative effect of layered convolutions, the receptive field grows to contain background context unrelated to the output pixel. This unrelated background brings noise into the training of the output pixel. Fortunately, a refined convolution layer that addresses this is already implemented: the deformable convolution layer.
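Deformable convolution ships with torchvision; a minimal usage sketch (shapes are illustrative):

import torch
from torchvision.ops import DeformConv2d

batch, in_ch, out_ch, k = 1, 3, 8, 3
x = torch.randn(batch, in_ch, 32, 32)

# offsets are predicted by a small conv net: 2 values (dy, dx) per kernel tap
offset_predictor = torch.nn.Conv2d(in_ch, 2 * k * k, kernel_size=k, padding=1)
deform_conv = DeformConv2d(in_ch, out_ch, kernel_size=k, padding=1)

offsets = offset_predictor(x)   # (1, 2*3*3, 32, 32)
out = deform_conv(x, offsets)   # sampling grid shifted by the learned offsets
print(out.shape)                # torch.Size([1, 8, 32, 32])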
CNN cheatsheet — the essential summary (Part 1)
       
I am also making it a CNN cheatsheet for my future self to go back to and revise whenever the need arises. In contrast to fully connected networks, where every input is connected to every neuron in a subsequent layer, CNNs use convolutions as the base operation. In batch normalization, along with the weights in the network, we train the scale and shift parameters (γ and β). Layer normalization likewise gives each neuron its own adaptive bias and gain, applied after the normalization but before the non-linearity. Unlike batch normalization, layer normalization performs precisely the same computation at training and test times.
10 Great Figma Plugins for Designers
       
Figma is a decent tool, but one thing makes it a great tool: Figma plugins. Figma is growing insanely fast as a collaboration platform for designers, so I explored the plugin ecosystem and created my own list of plugins that made my design workflow faster and better. Personally, I think Figma is a fantastic design tool, but trying Figma plugins for the first time made me a big fan of it; today I'm going to share some of the best and most important plugins, so without wasting any time, let's dive in. Many designers express their love for this plugin, and it's one of the most installed plugins on Figma.
10 Space Pictures That Look So Good You Won’t Believe They’re Real
       
This 20-year time-lapse of stars near the center of our galaxy comes from the ESO, published in 2018. About 50 dark plumes mark what are thought to be cryovolcanoes, with those trails caused by the phenomenon colloquially called 'black smokers' (NASA / Voyager 2). Saturn's moon Iapetus is two dramatically different colors. Jupiter's moon Io, with (then-)active volcanoes Loki and Pele, is eclipsed by Europa as viewed from Earth in 2015. It is part of a binary star system, where one member is ejecting hydrogen gas in the post-AGB phase.
U.S. House of Representatives Files Reply to Former President Trump’s Impeachment Trial Brief
       
Washington, D.C. — Today, the House Impeachment Managers, on behalf of the U.S. House of Representatives, filed with the Secretary of the Senate a Reply Memorandum to the Trial Memorandum filed by former President Donald J. Trump in his impeachment trial. In their Reply, the Managers respond to former President Trump’s meritless legal arguments and baseless assertions. President Trump appears to borrow his argument from a similar one made (unsuccessfully) during the impeachment trial of President Clinton. But any comparison to President Clinton’s impeachment does not help President Trump here. Click here to read the U.S. House Reply Memorandum to the Trial Memorandum filed by former President Donald J. Trump in his impeachment trial.
The End Of The COVID-19 Pandemic
       
After more than a year of the pandemic, everyone is ready for this to end. One thing we can say with some certainty is that COVID-19 is probably never going away entirely. The caveat to this pessimism is that it is probably possible to eradicate COVID-19. The pandemic might live on, but it will finally be something we can live with, rather than something that rules our lives. Maybe not this month, maybe not even this year, but in the near future we will probably see an end to COVID-19 as it tails off to become just another common cold.
Tweet Topic Modeling Part 1 : Using Twint to Scrape Tweets
       
One of the most important first steps of any machine learning or data analysis task is getting the data you need. Installing and using Twint: Twint is an advanced Twitter scraping tool written in Python that allows for scraping tweets from Twitter accounts without using Twitter's API. To read more on the benefits of using Twint, check out their useful GitHub page. Here we will iterate through all our users and store the resulting records into a data frame. Next, we can save our tweets to a csv file.
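A skeletal version of that collection loop with Twint might look like this (the usernames are placeholders, and the config flags shown are the commonly documented ones, not necessarily the article's exact settings):

import twint
import pandas as pd

users = ["user_one", "user_two"]  # placeholder account names
frames = []

for user in users:
    c = twint.Config()
    c.Username = user
    c.Limit = 100         # cap the number of tweets per user
    c.Pandas = True       # store results in Twint's internal dataframe
    c.Hide_output = True
    twint.run.Search(c)
    frames.append(twint.storage.panda.Tweets_df.copy())

tweets = pd.concat(frames, ignore_index=True)
tweets.to_csv("tweets.csv", index=False)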
Setup Your Raspberry Pi Quickly
       
The pi user: first of all, the default user is pi and the password is raspberry, and it is recommended that you change both when you start using your device, for maximum security. Note: if you have loaded Raspberry Pi OS onto a blank SD card, you will have two partitions. After this, the remote device will ask for the password, and then you are in.

Copy a file from a remote Pi to localhost:
$ scp pi@raspberrypi.local:file.txt /my/local/directory/

Copy file.txt from localhost to a remote Pi:
$ scp file.txt pi@raspberrypi.local:/remote/directory/

Copy a file between two remote hosts:
$ scp user@fromhostip:/from/dir/file.txt user@tohostip:/to/dir/

Set audio input/output: in the last part, I will go through the audio setup. Raspberry Pi creates virtual cards to handle audio input and output devices.
Image De-noising Using Deep Learning
       
As a result, I have implemented several deep learning architectures that far surpass traditional denoising filters. The article covers: problem formulation; machine learning problem formulation; source of data; exploratory data analysis; an overview of traditional filters for image denoising; deep learning models for image denoising; results comparison; deployment; future work and scope for improvement; references. Problem formulation: traditional image denoising algorithms always assume the noise to be homogeneous and Gaussian distributed. Picture the 3 color channels of an image, each filled with 8-bit numbers; now consider the same thing for a noisy image. Source of data: as this is a supervised learning problem, we need pairs of noisy images (x) and ground truth images (y).
Methods, Challenges, and Hazards of Collecting Tweets
       
I had built the project using tweets somebody else graciously collected and posted on Kaggle. All the code will be in Python but if that isn’t your language of choice, there are still lessons here for anyone interested in using tweets for their next project. This is another problem inherited from using Twitter’s search function and one that Twint’s devs aren’t keen on fixing due to the library’s focus on OSINT collection. I’m using “vaccine” here rather than the longer “Covid vaccine” to include users who used “corona” instead of “Covid” in their tweets. ConclusionI hope this article leaves you more knowledgeable about using Twitter data for your next project.
Deep Hashing for Similarity Search
       
Specifically, from the different L2H methods, we will dive deep into deep hashing. Deep hashing constitutes supervised L2H utilizing deep learning and comprises different hashing methods; some of the novel ones concerning similarity search on image data, along with their different functional design aspects, are elaborated upon below. Deep Pairwise-Supervised Hashing (DPSH), Li et al.: this method experimentally outperforms deep hashing based on pairwise labels and pre-existing triplet-label-based deep hashing methods. Loss function optimization, Deep Hashing Network (DHN): regarding an optimized loss function for pairwise supervised hashing, Zhu et al. [18] propose a novel deep hashing method, Deep Supervised Discrete Hashing (DSDH), which uses both pairwise label and classification information to generate discrete hash codes in a one-stream framework.
You Will Never Succeed If You Keep Applying for Jobs Online
       
How to leverage your network: it sounds cliché, but there's a reason they say, "Your network is your net worth." So if we're not applying online, what do we do? Let's ask the data: 70%-85% of people report finding their job through networking. Take all the time you spent applying online and start networking with peers in your industry. When I say network, it doesn't mean you ask for help right away. Networking will feel overwhelming when you start; at least to me, it was.
Thinking Fast and Slow and the Third Wave of AI
       
This is where Francesca Rossi's work comes into play, with the paper called Thinking Fast and Slow in AI. 10 Questions for the AI Research Community: here, I will only list these 10 important questions and their contextualization from the IBM paper "Thinking Fast and Slow in AI". Thinking Fast And Slow in AI, Booch et al. (2020), https://arxiv.org/abs/2010.06002. I definitely invite you to read the Thinking Fast and Slow in AI paper and Daniel Kahneman's book "Thinking, Fast and Slow" if you'd like more information about this theory.
Basics of Time Series with Python
       
Basics of Time Series with Python: working functions and fundamentals of time series with pandas. Time series analysis is a part of daily activities happening around us with respect to time. What do we deal with in time series data? Datetime: used for basic date and time functionality in Python. Calendar: the calendar class contains many calendar-related functions.
index = pd.DatetimeIndex(['2020-1-20', '2020-02-01', '2021-01-01', '2021-02-01'])
index
#output: DatetimeIndex(['2020-01-20', '2020-02-01', '2021-01-01', '2021-02-01'], dtype='datetime64[ns]', freq=None)
Suppose we want to make another series and use this index as the index of that series.
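A minimal sketch of that last step, using placeholder values of my own to show how a DatetimeIndex can serve as the index of a new Series:

import pandas as pd

index = pd.DatetimeIndex(['2020-01-20', '2020-02-01', '2021-01-01', '2021-02-01'])
# the values are placeholders; any sequence of equal length works
data = pd.Series([10, 20, 30, 40], index=index)
print(data['2021'])  # partial-string indexing by year selects the 2021 entries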
How to Predict Stock Prices with LSTM
       
How to Predict Stock Prices with LSTM: a practical example of stock price prediction with LSTM using Keras TensorFlow. Disclaimer: I do not believe that any ML or AI model can predict the future price of stocks. In this tutorial, I provide an example of how you can apply LSTM models to predicting stock prices. Sorry for not helping you become rich :-) In a previous post, we explained how to predict stock prices using machine learning models; in an earlier post, we had used LSTM models for Natural Language Generation (NLG), like the word-based and character-based NLG models. LSTM Model for Stock Prices, Get the Data: we will build an LSTM model to predict hourly stock prices.
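As a minimal sketch of what such a model can look like (the layer sizes and the 24-hour window length are my own assumptions, not the article's), the network takes a window of past prices and predicts the next one:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

window = 24  # hours of history per sample (assumed)
model = Sequential([
    LSTM(32, input_shape=(window, 1)),  # one feature: the price itself
    Dense(1),                           # next-hour price
])
model.compile(optimizer="adam", loss="mse")

# toy data: 100 sliding windows over a random walk, for illustration only
prices = np.cumsum(np.random.randn(200))
X = np.stack([prices[i:i + window] for i in range(100)])[..., None]
y = prices[window:window + 100]
model.fit(X, y, epochs=2, verbose=0)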
Creating AI Web Apps using TensorFlow, Google Cloud Platform, and Firebase
       
Prediction Service: a simple way to deploy your trained model into a production environment is to create a prediction service using the GCP AI Platform. Creating a Bucket for Your Project: after creating your project, store your SavedModel in a bucket. Create a Model: after uploading the SavedModel into a project bucket, create a Model on the AI Platform. Using the left navigation menu, under the Artificial Intelligence section, navigate to AI Platform, then click on "Create Model". After creating the model version, you will have successfully deployed your model to the cloud.
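For the bucket step, here is a sketch using the google-cloud-storage Python client; the bucket and path names are placeholders of mine, and the console or gsutil work just as well:

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-model-bucket")  # hypothetical bucket name

# upload one file of the SavedModel directory; repeat for each file in the export
blob = bucket.blob("models/my_model/saved_model.pb")
blob.upload_from_filename("export/saved_model.pb")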
Step-by-step implementation of GANs on custom image data in PyTorch: Part 2
       
Step-by-step implementation of GANs on custom image data in PyTorch: Part 2. Learn about the different layers that go into a GAN's architecture, debug some common runtime errors, and develop in-depth intuition behind writing code in PyTorch. In case you would like to follow along, here is the GitHub notebook containing the source code for training GANs using the PyTorch framework. Preparing the image dataset: one of the main reasons I started writing this article was because I wanted to try coding GANs on a custom image dataset. Preparing the custom dataset class: I know what you're thinking — why do I need to create a special class for my dataset? In case you need further help creating the dataset class, do check out the PyTorch documentation, as well as the sketch below.
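A minimal custom Dataset sketch under common assumptions (a flat folder of images, resized to 64x64; the folder path and transforms are placeholders, not the article's exact code):

import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class ImageFolderDataset(Dataset):
    def __init__(self, root):
        self.paths = [os.path.join(root, f) for f in os.listdir(root)]
        self.transform = transforms.Compose([
            transforms.Resize((64, 64)),
            transforms.ToTensor(),
            transforms.Normalize([0.5] * 3, [0.5] * 3),  # scale pixels to [-1, 1], usual for GANs
        ])

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = Image.open(self.paths[idx]).convert("RGB")
        return self.transform(img)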
Genetic Algorithm for Trading Strategy Optimization in Python
       
Genetic Algorithm for Trading Strategy Optimization in Python: how can a GA help cut down the problem space and converge towards a better solution? In comes the genetic algorithm (GA): a probabilistic, heuristic search algorithm inspired by Darwin's theory of natural selection, in which the fittest survive through generations. In this blog, we are going to use a GA as an optimization algorithm for identifying the best set of parameters (a bare-bones GA loop is sketched below). Remember, this is only a demonstration of applying a GA to optimize a trading strategy and should not be copied or followed blindly. If you would like to learn more about how to avoid overfitting a genetic algorithm, here is a sequel to this blog that focuses on techniques for a more robust genetic algorithm:
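A bare-bones GA loop, to make the selection/crossover/mutation cycle concrete; the fitness function and parameter ranges here are toy placeholders, not a trading strategy backtest:

import random

def fitness(params):
    # toy objective: closer to (10, 20) is fitter (placeholder for a backtest score)
    return -((params[0] - 10) ** 2 + (params[1] - 20) ** 2)

def mutate(params, rate=0.1):
    return [p + random.uniform(-1, 1) if random.random() < rate else p for p in params]

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

population = [[random.uniform(0, 50), random.uniform(0, 50)] for _ in range(30)]
for generation in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                      # survival of the fittest
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(20)]
    population = parents + children
print(max(population, key=fitness))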
How AI Will End the One-Size-Fits-All Approach in Human Assessment
       
Surprisingly, when it comes to human assessment (e.g., tests, quizzes, and recruitment exams), we happen to be big fans of the one-size-fits-all approach. Binet's tailored approach to human assessment was absolutely innovative and groundbreaking, but a bit laborious. Thanks to modern technology, we are now able to automate Binet's tailored assessment approach via computers. Computerized Adaptive Testing: computerized adaptive testing (CAT) is a form of computer-based assessment that follows the idea of tailored human assessment (or tailored testing). One of the best AI applications in human assessment is intelligent tutoring systems.
Tesseract OCR for Text Localisation and Detection
       
Tesseract OCR for Text Localisation and Detection: automating the extraction of useful information from PDF files. Optical character recognition ("OCR") systems have been widely used to provide automated text entry into computerised systems. However, the inability of conventional OCR systems to read more than a handful of type fonts and page formats remains unresolved. OCR systems transform a two-dimensional image of text, which could contain machine-printed or handwritten text, from its image representation into machine-readable text. Text detection is the process of localising where text appears in an image; it can be thought of as a specialised form of object detection.
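A short sketch of text localisation with pytesseract (the image path is a placeholder of mine); image_to_data returns per-word bounding boxes alongside the recognised text:

import pytesseract
from PIL import Image

img = Image.open("page.png")  # hypothetical input image
data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)

# print each detected word with its bounding box
for i, word in enumerate(data["text"]):
    if word.strip():
        print(word, (data["left"][i], data["top"][i], data["width"][i], data["height"][i]))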
Optical Character Recognition (OCR) for Text Localization, Detection, and More!
       
This work on reinforcement learning led by MineRL is fascinating. They are leading state-of-the-art work in the advancement and development of breakthrough RL methods for machine learning research. Check them out, especially if you are interested in Minecraft and reinforcement learning. If you haven't checked it out, their 2021 trends report is very comprehensive.
Learn from the winner of the AWS DeepComposer Chartbusters Track or Treat challenge
       
AWS is excited to announce the winner of the AWS DeepComposer Chartbusters Track or Treat challenge, Greg Baker. The Track or Treat challenge, which ran in October 2020, challenged developers to create Halloween-themed compositions using the AWS DeepComposer Music Studio. "It was so embarrassingly bad; I sort of buried it afterwards." Greg happened to find AWS DeepComposer in the AWS Console through his work. Building in AWS DeepComposer: in Track or Treat, developers were challenged to compose music using spooky instruments in the AWS DeepComposer Music Studio. Check out the next AWS DeepComposer Chartbusters challenge, and start composing today.
Pooled ROC with XGBoost and Plotly
       
Dividing the training data into multiple training and validation sets is called cross-validation. The ratio, size, and number of sets depend on the cross-validation method and the size of your training set. The most common is probably K-Fold, but depending on the size of the training set you might want to try bootstrapping or Leave-One-Out. As this is specifically meant to show how to build a pooled ROC plot, I will not run feature selection or optimise my parameters. Since we are using Plotly to plot the results, the plot is interactive and could be visualised inside a Streamlit app, for example.
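A sketch of the pooling idea with scikit-learn and XGBoost: collect out-of-fold predicted probabilities across K folds, then compute one ROC curve from the pooled predictions (the dataset and parameters are placeholders, not the article's):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.metrics import roc_curve, auc
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, random_state=0)
pooled_true, pooled_score = [], []

for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = XGBClassifier()
    model.fit(X[train_idx], y[train_idx])
    pooled_score.extend(model.predict_proba(X[val_idx])[:, 1])  # out-of-fold scores
    pooled_true.extend(y[val_idx])

fpr, tpr, _ = roc_curve(pooled_true, pooled_score)
print("pooled AUC:", auc(fpr, tpr))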
How to Build ML Model using BigQuery
       
Since the model training cannot handle string values as the output, it is necessary to encode them into numbers. I chose logistic regression because it is the easiest to start with and it is supported by BigQuery ML. Build the Model: building ML models in BigQuery involves splitting the sample data into a training dataset and an evaluation dataset. Training dataset: a subset of sample data used to create the model. Evaluation dataset: a subset of sample data used to assess the performance of the model.
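A hedged sketch of what training such a model looks like with the BigQuery Python client; the dataset, table, and column names are placeholders of mine, while the CREATE MODEL statement follows standard BigQuery ML syntax:

from google.cloud import bigquery

client = bigquery.Client()
query = """
CREATE OR REPLACE MODEL `my_dataset.churn_model`
OPTIONS(model_type='logistic_reg', input_label_cols=['label']) AS
SELECT feature_1, feature_2, label
FROM `my_dataset.training_table`
"""
client.query(query).result()  # blocks until training finishes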
Active sampling for pairwise comparisons
       
However, one of the drawbacks of pairwise comparisons is the large number of possible pairings. Pipeline: most active sampling methods for pairwise comparisons follow a similar pipeline. (Figure: an example pipeline for active sampling algorithms for pairwise comparisons.) Data: we start from the data collected so far; these would be the outcomes of pairwise comparisons. We are interested in the normally distributed score variable, given the outcomes of the pairwise comparisons collected so far.
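To make the score model concrete, here is one standard formulation (a Thurstone Case V style illustration of mine, not necessarily the paper's exact model): each condition i has a latent score $s_i$ observed with Gaussian noise, so the probability that i is chosen over j in a comparison is

$$P(i \succ j) = \Phi\!\left(\frac{s_i - s_j}{\sigma\sqrt{2}}\right),$$

where $\Phi$ is the standard normal CDF and $\sigma$ the observer noise; active sampling then picks the next pair expected to be most informative about the posterior over the scores $s$.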
Can foods help us fight COVID-19?
       
Can foods help us fight COVID-19? Clues of a path for those struggling with COVID-19 in solitary might be found in foods or, to be more precise, in hyperfoods. The anti-COVID-19 molecules identified belong to a variety of chemical classes within this "dark matter", including (iso)flavonoids, terpenoids, phenols, and indoles. We then built a food map ranking foods based on the bioavailability and diversity of the predicted antiviral molecules.
Let’s Keep Explainable Methods Practical and Relevant
       
We therefore need methods that are easily applied to black-box models, models whose internal components we can neither modify nor inspect, while being intuitive and accessible to both the expert and the layperson. In some cases, masking restores the model's original prediction on the image. Our method specifically answers two questions, the first being: what parts of the image did the model most rely on to make its prediction? Our method's objective is to create the smallest possible mask that changes the model's original prediction on an image when given a perturbed image (that same image with the mask applied). Remember that the model is a black box: its weights never change, and only the mask is updated!
How to Deliver a Successful Data Presentation
       
Offer a solution to a problem: research problems senior leadership has mentioned around the presentation topic and try to find data insights that provide a solution. Most likely you'll report lower conversion rates against control, because anytime you raise prices conversion rates drop; people are less likely to pay more for the same product. The presentation to senior leadership should focus less on the decrease and more on the incremental sales that can result from increasing prices. Questions asked by stakeholders are often asked by senior leadership too, and this gives you an opportunity to research the answers before your actual presentation. Providing valuable, actionable insights goes a long way toward senior leadership having a positive perception of you.
Go Ahead, Change My (AI) Mind
       
Go Ahead, Change My (AI) Mind. Have you ever tried to change someone's mind? For all their flaws and biases, one thing that distinguishes AI systems from human decision systems is that it is comparatively easy to lay bare their biases and change their minds. AI systems, on the other hand, can handle being fed thousands or millions of test samples. And when we do find a bias in an AI system, we can change its mind on the spot. Some would argue that the bar here is not human decision systems, but rule-based systems, which are also arguably consistent, controllable, and auditable.
Using Data Science for Social Impact: Migrants and Their Fatal Routes
       
The present — an analysis: a few weeks ago, I stumbled upon the Missing Migrants Project and the associated data, and it refreshed my memory of the train incident. The Missing Migrants Project tries to capture as much information as possible about migrants, and it draws on myriad sources; a few of them are reliable, and the rest may be underreporting for various reasons. There is a lot of missing data. I treated the missing values and took a deeper look at the data. From what I have gathered, hundreds of people travel for weeks in small dinghies at the mercy of the wind to reach the shore.
Hugging Face Transformers: Fine-tuning DistilBERT for Binary Classification Tasks
       
Hugging Face Transformers: Fine-tuning DistilBERT for Binary Classification Tasks, with Hugging Face and TensorFlow 2.0. (Note: make sure to split your data beforehand and only oversample the training set, to ensure your evaluation results remain as unbiased as possible!) Including the attention mask as an input to your model may improve model performance. 3.3) Training Classification Layer Weights: OK, we've finally built up our model, so we can now begin to train the classification layer's randomly initialized weights until model performance converges.
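A sketch of passing the attention mask alongside the token IDs, using the TensorFlow DistilBERT classes; the checkpoint name and toy inputs are my own, not the article's:

from transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = TFDistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

enc = tokenizer(["great movie", "terrible movie"],
                padding=True, truncation=True, return_tensors="tf")
# enc contains both input_ids and attention_mask; the mask tells the model
# which positions are real tokens and which are padding
outputs = model(enc["input_ids"], attention_mask=enc["attention_mask"])
print(outputs.logits)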
NLP for Supervised Learning — A Brief Survey
       
NLP for Supervised Learning — A Brief Survey. It's hard to keep up with the rapid progress of natural language processing (NLP). FastText open-sourced its code as well as multilingual word embeddings trained with it. Improving word embeddings with context (2018): in traditional word embeddings (e.g., Word2vec, GloVe), each token has only one representation (i.e., embedding). The ELMo word representation (i.e., vector) is then concatenated with the token vector to enhance the word representation in the downstream task (e.g., classification). The decoder stack is similar but includes an additional attention layer to learn attention over the encoder's output.
Python Can Be Faster Than C++
       
Python Can Be Faster Than C++. Python is a great, versatile programming language. Even though Python is used most for machine learning because of its libraries and high-level syntax, it is known to be slower than many other languages. Because of this reputation, many decide to leave the language behind and stick with options like C++ for problem solving. In this article, I will show you how Python can be faster than C++. Basic Speed Testing: to test the speed difference between Python and C++, I will test the execution time of a prime-generation algorithm.
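For reference, here is a simple way to time a prime-generation routine in Python; the sieve and the limit are my own toy choices, and the article's exact benchmark may differ:

import timeit

def primes_up_to(n):
    # simple sieve of Eratosthenes
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

print(timeit.timeit(lambda: primes_up_to(100_000), number=10), "seconds for 10 runs")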
Flutter Failed To Solve the Biggest Challenge for Our Cross-Platform App
       
When the decision was made to finally create an Android version of the app, the obvious question was how. After some research into the current state of multiplatform app development, there seemed to be quite a lot of hype about the latter. Once it is loaded and rendered in a WebView, I replace the spinner with the rendered content. As I could not ship my app in that state, I started to look for alternative approaches. Turns out I found another bug, this time in the Flutter WebView package, that I filed and that also was quickly acknowledged, reproduced, and triaged by the team.
I’ll Never Parent the Same Way After This Pandemic Is Over
       
It was the up-close look at the intense anxiety that spools its way around every minutia of every moment in a day. In the days before Covid, I spent my days trying (and usually failing) to keep my house orderly. I’ll still toss it in the wash and dryer, but I forced my family to step up and fold and put away their clothes. Self-care is refusing to let the chores in this house determine or measure the quality of my ability to parent. I am learning to be honest with my husband about how much I hate being the default parent.
U.S. House of Representatives Files Replication to Former President Trump’s Answer to the Article of Impeachment
       
U.S. House of Representatives Files Replication to Former President Trump's Answer to the Article of Impeachment. Washington, DC — Today, the House Impeachment Managers, on behalf of the U.S. House of Representatives, filed with the Secretary of the Senate a Replication to the Response to the Summons of former President Donald J. Trump to the Article of Impeachment. In the Replication, the House Managers write: "The evidence of President Trump's conduct is overwhelming. As charged in the Article of Impeachment, President Trump violated his Oath of Office and betrayed the American people." He will be joined by Congresswoman Diana DeGette, Congressman David Cicilline, Congressman Joaquin Castro, Congressman Eric Swalwell, Congressman Ted Lieu, Congresswoman Stacey Plaskett, Congresswoman Madeleine Dean, and Congressman Joe Neguse. Click here to read the Replication to the Response to the Summons of former President Donald J. Trump to the Article of Impeachment.
Avoid Painkillers Before and After Covid Vaccine, Experts Say
       
Avoid Painkillers Before and After Covid Vaccine, Experts Say. Experts say that people should avoid painkillers before and after getting a Covid-19 vaccine, to give it the best chance of doing what it's supposed to do: stimulate your immune system. The possibility has not been studied with Covid vaccines, but Mina and other experts say it's possible, so they advise skipping painkillers if you can bear it. And afterward, "try very hard not to." Know your drugs: if you must take something for aches or fever after a Covid shot, all three major classes of over-the-counter painkillers can be effective, Mina and other experts say. If the shot itself causes pain, try covering it with a cool, wet cloth, or exercise your arm, the CDC suggests. Most people who do catch the disease by another means can recover at home, and painkillers are among the few helpful remedies available without a prescription for mild Covid symptoms.
Inslee signs bipartisan bill to support business and workers
       
Gov. Jay Inslee today signed legislation providing relief for businesses and workers impacted by the COVID-19 pandemic. At a time when revenue is down and employers are facing increased costs of business, this bill offers much-needed relief. Additionally, SB 5061 makes policy updates to ensure that Washington's unemployment insurance system is more nimble and responsive during public health emergencies. "SB 5061 provides a bridge for those who need it most," said Keiser.
Function Optimization With SciPy
       
The local search optimization algorithms available in SciPy. Tutorial Overview: this tutorial is divided into three parts; they are: Optimization With SciPy; Local Search With SciPy; Global Search With SciPy. Optimization With SciPy: the Python SciPy open-source library for scientific computing provides a suite of optimization techniques, among them scalar optimization, the optimization of a convex single-variable function.
# minimize an objective function
result = minimize(objective, point)
Additional information about the objective function can be provided if known, such as the bounds on the input variables, a function for computing the first derivative of the function (gradient or Jacobian matrix), a function for computing the second derivative of the function (Hessian matrix), and any constraints on the inputs. The library also provides the shgo() function for global optimization and the brute() function for grid-search optimization.
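A runnable example of the pattern above, minimizing a simple convex function with scipy.optimize.minimize; the objective and starting point are toy choices of mine:

from scipy.optimize import minimize

def objective(x):
    # simple convex bowl: minimum at (0, 0)
    return x[0] ** 2 + x[1] ** 2

result = minimize(objective, [1.5, -0.5])
print(result.x, result.fun)  # solution point and objective value at the solution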
Estimating Time until Contract Termination — Survival Analysis with Lifelines
       
This is why we have the concept of right-censoring. To clarify, we will use an employee attrition dataset from Kaggle, in which some contracts have been terminated. As we are interested in the termination time, we need the employee data with the contract status (ACTIVE or TERMINATED) and the time, in months, the employee has served within the company until now, or until the contract was terminated. This is because survival analysis was developed to deal with estimation using right-censored data.
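A sketch of fitting a survival curve with lifelines under those definitions; the file and column names are placeholders for the Kaggle dataset's actual fields:

import pandas as pd
from lifelines import KaplanMeierFitter

df = pd.read_csv("attrition.csv")  # hypothetical file name
# duration: months served; event: 1 if contract TERMINATED, 0 if still ACTIVE (right-censored)
durations = df["months_in_company"]
events = (df["status"] == "TERMINATED").astype(int)

kmf = KaplanMeierFitter()
kmf.fit(durations, event_observed=events)
print(kmf.median_survival_time_)  # months by which half the contracts have ended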
3 Areas Missing From Data Science Courses You Should Know
       
These are insufficient, however, as there are three important areas that you won't typically find in data science courses or in some university degrees. Here I'm not talking about clean data science tutorials or research problems, but the typical real-world business data science problems most organisations have. What is data preparation? Well, it could be anything from scraping, cleaning, normalising, joining, filling missing data, vectorizing, pivoting, and labelling. Especially if you are not from a computer science background, you will have limited exposure to development or software engineering if you have only done data science courses. To learn more about my views, have a read of my other posts, like Recommendations for Working in Data Science, AI and Big Data Based on my Personal Experience.
K-Nearest Neighbors (K-NN) Explained
       
K-NN (k=1) vs. K-NN (k=9) classifiers (all images are generated by the author). K-Nearest Neighbors (K-NN) Explained: I have decided to create a Python library implementing popular machine learning algorithms using only NumPy and SciPy. Given how busy I am at the moment, I took on a relatively easy algorithm for my next step: K-Nearest Neighbors (K-NN). You simply find the k nearest data points and use their labels to predict the labels of new data points. The K-NN Algorithm: the K-NN algorithm is very simple, and the first five steps are the same for both classification and regression. K-NN has issues with extrapolating beyond the range of values in the training data, although this is not unique to K-NN.
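In that spirit, here is a NumPy-only K-NN classifier sketch, a minimal illustration of the idea rather than the author's library code:

import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    # Euclidean distances from the query point to every training point
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]            # indices of the k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]           # majority vote

X_train = np.array([[0, 0], [0, 1], [5, 5], [6, 5]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([5, 4]), k=3))  # -> 1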
NLP for Supervised Learning — A Brief Survey
       
NLP for Supervised Learning — A Brief Survey. It's hard to keep up with the rapid progress of natural language processing (NLP). The decoder stack is similar but includes an additional attention layer to learn attention over the encoder's output. Discriminative fine-tuning is applied, where each layer is fine-tuned with a different learning rate: the last layer has the highest learning rate, and each preceding layer has a reduced learning rate. Standard language models cannot learn bidirectionally (which is why BERT used the cloze task for pre-training); XLNet (June 2019) addresses this via permutation language modelling (LM), in contrast to BERT's masked language modelling.
Predicting Home Prices: Using Regression with Categorical Factors
       
Predicting Home Prices: Using Regression with Categorical Factors. Introduction to regression with categoricals: regression is a staple in the world of data science, and as such it's useful to understand it in its simplest form. Faceting just means that rather than creating a single histogram, we create a histogram for each level of a given categorical variable. Let's Build a Regression Model: when building a regression model, it's important to understand what exactly is going on under the hood. Group Means: we'll kick this off by looking at the mean value for each level of the waterfront group, as in the sketch below. I hope this primer on using categorical variables for regression proves useful as you leverage these and other tools to conduct analysis.
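A quick sketch of those group means with pandas; the file and column names are placeholders for the housing dataset's fields:

import pandas as pd

df = pd.read_csv("homes.csv")  # hypothetical housing data
# mean sale price for each level of the categorical waterfront variable
print(df.groupby("waterfront")["price"].mean())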
Building An Anomaly Detection System? 5 Requirements It Must Meet
       
Building An Anomaly Detection System? It is definitely possible to build your own anomaly detection system, but to make it successful, you need to meet these five requirements. (Not yet familiar with anomaly detection?) And there is no point in having an anomaly detection system that its users can't interpret. To Sum Up: getting a production-ready anomaly detection system working well for business users is more than just setting an algorithm loose on your data sets.
What I Learned Setting up Storage for a Machine Learning Project
       
What I Learned Setting up Storage for a Machine Learning Project. It is amazing how far machine learning frameworks and technology in general have come, and how fast we're nowadays able to integrate machine learning features into applications. I used Python to read and parse the JSON files into an array and inserted the array at once into the MongoDB collection. I went with option one at first: a collection for all tweets and another collection for each tweet's classification. It took 20 seconds to find tweets that were in the tweets collection but not in the classifications collection. In the end, I had implemented a scalable storage solution for my machine learning project and a lightweight, fast classification process.
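The bulk-insert step might look like this with pymongo; the file, database, and collection names are placeholders of mine:

import json
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
collection = client["twitter_db"]["tweets"]

with open("tweets.json") as f:
    docs = json.load(f)          # parse the JSON file into a list of dicts

collection.insert_many(docs)     # one round trip instead of one insert per tweet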
Practical NoSQL Guide with MongoDB
       
Practical NoSQL Guide with MongoDB. NoSQL refers to non-SQL or non-relational database design. The common structures adopted by NoSQL databases to store data are key-value pairs, wide columns, graphs, or documents. One of the popular ones is MongoDB, which stores data as documents. If you'd like to read more about NoSQL and how to set up MongoDB on your computer, here is an introductory guide you can refer to. The examples will query a collection called "marketing" which stores data about a marketing campaign of a retail business.
Implementing Single Shot Detector (SSD) in Keras: Part II — Loss Functions
       
Implementing Single Shot Detector (SSD) in Keras: Part II — Loss Functions. This article is part of a bigger series called Implementing Single Shot Detector (SSD) in Keras. According to Girshick (2015): "Smooth L1 loss is a robust L1 loss that is less sensitive to outliers than the L2 loss used in R-CNN and SPPnet." To construct the overall loss function, we first need to code the loss functions for softmax loss and smooth L1 loss. The article's code demonstrates the SSD loss function in Keras. Conclusion: this article demonstrates the concepts and code needed to implement the SSD loss function in Keras. See also: Understanding Categorical Cross-Entropy Loss, Binary Cross-Entropy Loss, Softmax Loss, Logistic Loss, Focal Loss and all those confusing names.
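The article's own snippet is not reproduced here, but a standard smooth L1 loss in Keras looks roughly like this; it is a generic implementation of Girshick's definition, not necessarily the article's exact code:

import tensorflow as tf

def smooth_l1_loss(y_true, y_pred):
    # 0.5*x^2 where |x| < 1, and |x| - 0.5 otherwise, summed over box coordinates
    diff = tf.abs(y_true - y_pred)
    loss = tf.where(diff < 1.0, 0.5 * tf.square(diff), diff - 0.5)
    return tf.reduce_sum(loss, axis=-1)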
Helping African Farmers Increase Their Yields Using Deep Learning
       
Helping African Farmers Increase Their Yields Using Deep Learning. (A healthy cassava plant, photo by malmanxx on Unsplash.) If you are like me, then most of the time you can't remember what you dreamed. I dreamed I got an email announcing I had won the Cassava Leaf Disease competition. My place on the public leaderboard of the Cassava Leaf Disease competition was 2343. So much for dreams coming true… I had just started with the Cassava Leaf Disease competition after passing the Developer Certification exam. A solution using image recognition with deep learning would be the ideal candidate.
5 Recommended Articles For Data Scientists (Feb 8)
       
Introduction: the vast range of topics and disciplines associated with the data science field, and more broadly the machine learning field, allows technical writers on Medium to cover interesting subjects in their articles. Below are five data-science-focused articles that shouldn't go unread. The topics covered by the presented articles range from cybersecurity to data science project management. Note: feel free to use the comment section to share any ML and DS article you've come across this week that is worth sharing. Happy reading.
Using Deep Learning in the Fight Against COVID-19
       
When ER patients show breathing problems, a CT scan is often the first prognosis tool a doctor will use to suggest further treatment. To predict Covid-19, we build on previous work and the dataset established in "A large dataset of real patients CT scans for SARS-CoV-2 identification" (Soares et al.). This contains around 1,300 CT scans of healthy lungs and around 1,300 CT scans of Covid-infected lungs, from 250 COVID-negative patients and 247 COVID-positive patients. Autoencoders are a type of neural network used to learn a representation (encoding) for a set of data. Covid-19 Autoencoder: the results of this autoencoder are pretty conclusive; here is a sample of 4 CT scans and their corresponding outputs.
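A minimal convolutional autoencoder sketch in Keras, to make the encode/decode idea concrete; the input size and layer widths are illustrative assumptions, not the article's architecture:

from tensorflow.keras import layers, models

inp = layers.Input(shape=(128, 128, 1))          # grayscale CT slice (assumed size)
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inp)
x = layers.MaxPooling2D()(x)                      # downsample to 64x64
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D()(x)                # 32x32 bottleneck representation

x = layers.Conv2D(8, 3, activation="relu", padding="same")(encoded)
x = layers.UpSampling2D()(x)
x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
x = layers.UpSampling2D()(x)
out = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")  # reconstruct the input slice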
An empirical approach to speedup your BERT inference with ONNX/Torchscript
       
Inference takes a relatively long time compared to more modest models, and it may be too slow to achieve the throughput you need. We will focus on this approach here to see the impact it can have on the throughput of our model. First we'll take a quick look at how to export a PyTorch model to the relevant format/framework; if you don't want to read code, you can skip to the results section further down. If you want to run inference on both CPU and GPU, you need to save two different models. Inference time scales up roughly linearly with sequence length for larger batches, but not for individual samples. Next steps: while these experiments have been run directly in Python, both TorchScript and ONNX models can be loaded directly in C++, which could provide an additional boost in inference speed.
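The two exports might look roughly like this; the checkpoint, dummy input, opset, and axis names are my own choices, not the article's exact code:

import torch
from transformers import BertModel

model = BertModel.from_pretrained("bert-base-uncased", torchscript=True)
model.eval()
dummy = torch.randint(0, 1000, (1, 128))  # fake token IDs: batch 1, sequence length 128

# TorchScript: trace the model with an example input
traced = torch.jit.trace(model, dummy)
torch.jit.save(traced, "bert.pt")

# ONNX: export with dynamic batch and sequence axes
torch.onnx.export(model, dummy, "bert.onnx", opset_version=12,
                  input_names=["input_ids"], output_names=["output"],
                  dynamic_axes={"input_ids": {0: "batch", 1: "seq"}})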
7 signs you’re rich, even if it doesn’t feel like it
       
Maybe it means being able to buy a flashy mansion or spend your life flitting from luxury vacation to luxury vacation. The self-made millionaire stars of "West Texas Investor's Club" say their relationships are more valuable than the money they earn. You will eventually be able to pay for the things you really want. If you can go out and buy a yacht in cash today, most people would agree that you're rich. You're probably still rich. If you have the luxury to focus on something other than the money, you're in a good place.
6 visual design fundamentals that UX designers should consider
       
When I was studying visual communications design in college, I was fascinated by how much power designers have in making people think, feel, and act a certain way. We've all heard of visual design elements like shape and line, and principles like contrast, emphasis, and rhythm. For example, want to make an object feel like it's peeping in? Often visual tension is just unintentionality: designers not realizing that they accidentally put shapes adjacent to each other in ways that create tension. You can, if you're intentional, use visual tension to draw a user's eye and create emphasis.
Apple: The Hyundai/Kia Gift
       
by Jean-Louis Gassée. The Apple Car speculation continues. The January 23rd Monday Note wondered: Sure, the first Apple Car (a.k.a. That said, I now need to justify this Note's title, The Hyundai-Kia Gift, begging for your patience because you might see how it is relevant to the Apple Car challenges. […] "Tech firms like Google and Apple want us to be like Foxconn…" Here is the gift: Hyundai-Kia execs tell Apple how the relationship will work, before the marriage contract is inked. In the meantime, the form and purpose of the putative Apple Car remain a mystery.
‘Fuck Your Feelings’ Never Applies to White Men
       
The evidence of the past months suggests there are a great many men, particularly white men, who are fatally unable to distinguish facts from feelings. The notion that anyone might have a political priority more pressing than mollifying white male feelings is still shocking. Most of us, after all, have been raised to pay attention to the emotions of white men, to attempt to manage their moods, because not doing so might be dangerous. In a culture held hostage by the emotions of white men, this apparently matters more than objective truth.
Africa is Planting Tens of Millions of Trees in the Desert. Here’s Why.
       
The Tree of Ténéré, Niger (1961). In the dusty, windswept lands of Niger, there once stood a lonely acacia tree. Known as the Tree of Ténéré, it was the only one for hundreds of miles, the loneliest tree on Earth. Already farmers and pastoralists are abandoning their ancestral lands, as millions of Ethiopians are pushed into food insecurity by drought. 'It's already upon us.' Ever a man of action, Sankara proposed the creation of a park of 10 million trees. Even worse, as many as 80 percent of the planted trees had likely died.
Explainable Neural Networks: Recent Advancements
       
Explainable Neural Networks: Recent Advancements, Part 1. The Rise of Neural Networks: neural networks have been outperforming many other machine learning models, like SVMs, AdaBoost, and gradient boosted trees, in a variety of tasks like image recognition, visual question answering, and sentence completion, establishing themselves as the new state of the art in many machine learning problems. Multiple studies over the years have proved the effectiveness of neural networks in learning implicit patterns in data. Despite this limitation, what makes neural networks so good at recognizing images when they are trained on a large collection of labelled images? In parallel with the developments in finding better-performing, easier-to-train deep neural networks in the decade 2010-2020, more and more researchers have also been interested in developing insights into what a neural network learns and how it makes its decisions. Since then, most work on explainable neural networks has employed some version of using gradients, or defining gradient-like quantities, which can be backpropagated down to the inputs to understand the decisions of the network.
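A gradient-based saliency sketch in PyTorch, illustrating the backpropagate-to-the-inputs idea in its simplest form; the model and random input are placeholders, and any differentiable classifier works:

import torch
import torchvision.models as models

model = models.resnet18(pretrained=True).eval()
image = torch.randn(1, 3, 224, 224, requires_grad=True)  # stand-in for a real image

score = model(image)[0].max()   # score of the top predicted class
score.backward()                # gradients flow back to the input pixels

saliency = image.grad.abs().max(dim=1)[0]  # per-pixel importance map, shape (1, 224, 224)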
10 NLP Terms Every Data Scientist Should Know
       
My goal in writing this article is to create a one-stop article for anyone curious about natural language processing, covering the meaning of the commonly used terms in the field. Because once you know them, you will be able to read any article or watch any video about natural language processing with minimal confusion. Lexicons are essential for getting more accurate results out of your natural language processing models. №9: Named Entity Recognition (NER): in any natural language processing task, we are often asked to read, clean, and analyze a huge corpus. This article presented the basic terminology of natural language processing that you will find in most articles and videos describing any aspect of the field.
Designing Custom 2D and 3D CNNs in PyTorch
       
Image Dimensions: a 2D CNN can be applied to a 2D grayscale or 2D color image. Defining a 2D CNN Layer in PyTorch: in PyTorch, the function for defining a 2D convolutional layer is nn.Conv2d. The number of out_channels of one CNN layer becomes the number of in_channels of the next CNN layer. 2D CNN Sketch with Code: here is a sketch of a 2D CNN. Defining a 3D CNN Layer in PyTorch: the function to define a 3D CNN layer in PyTorch is nn.Conv3d.
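A brief sketch showing both layer types and the channel chaining described above; the channel counts and kernel sizes are arbitrary examples:

import torch
import torch.nn as nn

conv2d_a = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
conv2d_b = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)  # in = previous out
x = torch.randn(1, 3, 64, 64)            # (batch, channels, height, width)
print(conv2d_b(conv2d_a(x)).shape)       # torch.Size([1, 32, 64, 64])

conv3d = nn.Conv3d(in_channels=1, out_channels=8, kernel_size=3, padding=1)
v = torch.randn(1, 1, 16, 64, 64)        # (batch, channels, depth, height, width)
print(conv3d(v).shape)                   # torch.Size([1, 8, 16, 64, 64])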
How to Write A Data Science Resume
       
Automated Resume Screening: once you submit your resume, it is parsed by an application. Hiring Manager: the hiring manager is looking for indicators that you can be successful in the job. Project work follows the same format and content as work experience. Active Versus Passive Job Hunting: if you are applying directly to a job, customize your resume to speak to the hiring manager. Some resume discovery tools use resume age as an indication of how accurate the resume is, or how likely you are to be open to a new role.
Check For a Substring in a Pandas DataFrame Column
       
Check For a Substring in a Pandas DataFrame Column. The Pandas library is a comprehensive tool not only for crunching numbers but also for working with text data. Using "contains" to find a substring in a Pandas DataFrame: the contains method in Pandas allows you to search a column for a specific substring. The contains method returns boolean values for the Series: True if the original Series value contains the substring, and False if not. We simply use the contains method to acquire True and False values based on whether the "Name" column includes our substring, and then return only the rows where the value is True. Using regex with the "contains" method in Pandas: in addition to matching on a plain substring, we can also use contains to match on regular expressions.
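A quick sketch of both variants; the DataFrame here is a made-up example:

import pandas as pd

df = pd.DataFrame({"Name": ["Alice Smith", "Bob Jones", "Carol Smithson"]})

# plain substring match
print(df[df["Name"].str.contains("Smith")])

# regex match: names ending in "son"
print(df[df["Name"].str.contains(r"son$", regex=True)])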
Improve the train-test split with the hashing function
       
While preparing the data for training and evaluation, we normally split the data using a function such as Scikit-Learn's train_test_split. I applied the 80-20 train-test split using a random_state to ensure the results are reproducible. The function follows the logic described in the paragraph above, where 2³² is the maximum value of the hashing function. Before testing the function in practice, it is really important to mention that you should use a unique and immutable identifier as input to the hashing function. Conclusions: in this article, I showed how to use hashing functions to improve the default behavior of the train-test split.
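A common form of this idea, sketched in the spirit of the article with zlib.crc32 (the ID column is a placeholder): each record stays on the same side of the split no matter how the dataset grows, because assignment depends only on the hash of its immutable identifier.

import zlib
import numpy as np
import pandas as pd

def in_test_set(identifier, test_ratio=0.2):
    # a record is in the test set if its hash falls in the lowest
    # test_ratio fraction of the 2**32 hash range
    return zlib.crc32(np.int64(identifier)) < test_ratio * 2**32

df = pd.DataFrame({"id": range(1000), "value": np.random.randn(1000)})
test_mask = df["id"].apply(in_test_set)
train, test = df[~test_mask], df[test_mask]
print(len(train), len(test))  # roughly 800 / 200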
Land Your Dream Data Science Job: Signs of an Amazing Candidate
       
"Looking for a data scientist that's a solid average." Have you ever heard someone say that? The CEO is constantly asking you about new ways to insert data science to boost the bottom line. "The new way is better because ____." Don't be surprised if you're tapped to lead more and more data science projects, and get more resources and freedom. Data science interview processes often involve a project component where candidates work through an analysis problem. What could improve both our lives as data scientists and the lives of the customers who benefit from an optimized experience?
You Should Master Data Analytics First Before Becoming a Data Scientist
       
However, the best way is to have some sort of data analytics practice with other people, as you will see below when I discuss the top four benefits of mastering data analytics before learning data science. As you specialize in data analytics, it is no surprise that you become efficient at exploring data. Working as a data analyst beforehand often involves more collaboration than a data scientist role does. Summary: so the question is, should you become a data analyst before becoming a data scientist? Please feel free to comment down below if you became a data analyst in some way before becoming a data scientist.
Demi Moore’s Face Drama Is Nothing New
       
Demi Moore's Face Drama Is Nothing New. (Image: David Shankbone, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=11705580.) Let's talk about aging in Hollywood. They're damned if they get plastic surgery and damned if they don't. Here is what happened with Demi Moore's face. But if we can tell that a female celebrity has had plastic surgery, well, then they "messed up their face." Demi Moore is not the first to be dragged for her alleged plastic surgery. If you're aging, congrats.
The 17/20 nutrition principle is the simple approach a Hollywood trainer uses to get A-listers in shape
       
By Rachel Hosie. Magnus Lygdbäck is the personal trainer and nutritionist responsible for some of Hollywood's biggest stars' physiques. As well as putting A-listers through their paces in the gym, Lygdbäck ensures his clients are eating right for their goals, and his approach to nutrition is unique and refreshingly simple. It's called the 17/20 system, and it involves no calorie tracking, no forbidden foods, and no extreme restriction. But Lygdbäck thinks an easier approach is to keep portion size in check using your hands as a guide. "It's there to provide you with structure without forcing you to eat certain things or take out foods," Lygdbäck said.
Unleashing Sum Types in Pure C99
       
Fortunately, our problem was solved a long time ago; the solution is called sum types. Sum types allow significantly safer manipulation: normally we cannot access a block's data when we match a small value, and vice versa. It features a clean, type-safe interface, as well as formally defined code generation semantics, thus allowing one to write an FFI for libraries exposing sum types in their interface. Sum types are also highly applicable in compiler/interpreter development. The next article will be dedicated to zero-cost, convenient error handling using sum types.
Bitcoin Mining With 12 Lines of Code in Python
       
Bitcoin Mining With 12 Lines of Code in Python. Bitcoin has become one of the hottest trends in recent years. Let's understand this with an example: one ledger entry would be, "You are paying the vegetable owner 15 dollars," and another, "The vegetable owner is paying 10 dollars to the doctor." Then we will learn how to do bitcoin mining in Python.
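The core of the mining loop is a proof-of-work search; here is a minimal sketch with hashlib, where the difficulty of four leading zeros is a toy choice, far easier than real Bitcoin:

import hashlib

def mine(block_data: str, difficulty: int = 4) -> int:
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce  # found a hash with the required leading zeros
        nonce += 1

print(mine("You are paying the vegetable owner 15 dollars"))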
Creating Automated Python Dashboards using Plotly, Datapane, and GitHub Actions
       
But in the real world, analyzing data, learning how to 'automate the boring stuff,' creating ETL pipelines and dashboards, and guiding stakeholders or clients onto the correct path often delivers much more value. In the real world, the problems data scientists face aren't only how to tweak ML models and compare metrics, as they would be in Kaggle competitions. The more you actively participate in the decision-making process, the more you learn about how to test hypotheses and solve "real" problems, not textbook ones. In this article, we will use Python to pull live stock market prices using pandas-datareader and create an interactive report using Plotly and Datapane. We will then use GitHub Actions to trigger our code to run every day to update our report.
The Best Way to Listen to Tidal Is Not With Expensive Hifi Equipment
       
The Best Way to Listen to Tidal Is Not With Expensive Hifi Equipment. There is a myth out there that has to do with streaming and Hifi. The problem with all of the Hifi music streamers out there today, their Achilles heel you could say, is that they do not support Tidal Offline. I want you to really, really understand this fully. No matter what they say, the only devices which support Tidal Offline are iOS and Android phones and tablets. The $1000+ you are spending on the Hifi music streamer is completely wasted, and it is effectively a lie.
‘Malcolm & Marie’ Isn’t About a Relationship, It’s About Abuse
       
'Malcolm & Marie' Isn't About a Relationship, It's About Abuse. I thought I knew about Malcolm & Marie when I sat down to watch it. After that moment, Malcolm & Marie never returns to a safe space. Malcolm & Marie is reminiscent of that trauma, and I'm sure it's triggering for people who have survived abuse, manipulation, and gaslighting in relationships. Malcolm & Marie is a good movie in the way that movies can be objectively good. Instead, Malcolm & Marie is about devastation.
Google AI Blog: TracIn — A Simple Method to Estimate Training Data Influence
       
The quality of a machine learning (ML) model's training data can have a significant impact on its performance. In "Estimating Training Data Influence by Tracing Gradient Descent", published as a spotlight paper at NeurIPS 2020, we proposed TracIn, a simple, scalable approach to tackle this challenge. Suppose that the test example is known at training time and that the training process visited each training example one at a time. During training, visiting a specific training example would change the model's parameters, and that change would then modify the prediction/loss on the test example. Conclusion: TracIn is a simple, easy-to-implement, scalable way to compute the influence of training data examples on individual predictions, or to find rare and mislabeled training examples.
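In its practical checkpoint form, as given in the paper, the influence of a training example $z$ on a test example $z'$ is approximated by summing gradient dot products over saved checkpoints $w_{t_1}, \dots, w_{t_k}$ with learning rates $\eta_i$:

$$\mathrm{TracInCP}(z, z') = \sum_{i=1}^{k} \eta_i \, \nabla \ell(w_{t_i}, z) \cdot \nabla \ell(w_{t_i}, z'),$$

so a large positive value marks $z$ as a proponent of the prediction on $z'$, and a negative value marks it as an opponent.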
Anomaly detection with Amazon Lookout for Metrics
       
In this post, we walk you through a retail example to demonstrate how Amazon Lookout for Metrics, a new ML-based AWS service for anomaly detection, helps accelerate the detection and remediation of anomalies. Anomaly detection in retail transactional data: let's dive into an anomaly detection use case we worked on for an online retailer. For them, quick and accurate anomaly detection in KPIs is imperative for timely remediation, in order to maintain inventory flow and price compliance. Comparing Amazon Lookout for Metrics to traditional anomaly detection methods: one of the key benefits of using Lookout for Metrics is a reduction in setup time from days to hours. By leveraging a managed service like Lookout for Metrics, anomaly detection is simplified and automated, saving time and effort.
Using genetic algorithms on AWS for optimization problems
       
GAs use concepts from evolution, such as survival of the fittest, genetic crossover, and genetic mutation, to solve problems. After a number of generations of evolution, the best solution found across all the generations is chosen as the final problem solution. When the algorithm exits the main loop, the best solution found during that run is used as the problem's final solution. After you create the IAM role and DynamoDB tables, we're ready to set up the GA code and run it using SageMaker. The same applies to a tournament of size 3 or more: you simply use the individual with the best fitness score.
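Tournament selection itself is nearly a one-liner in Python; a sketch with a toy fitness function of my own:

import random

def tournament_select(population, fitness, k=3):
    # sample k individuals at random and keep the fittest one
    return max(random.sample(population, k), key=fitness)

population = [[random.random() for _ in range(5)] for _ in range(50)]
winner = tournament_select(population, fitness=sum, k=3)  # toy fitness: sum of genes
print(winner)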
Amazon DevOps Guru is powered by pre-trained ML models that encode operational excellence
       
Amazon DevOps Guru is a turn-key solution that helps operators by automatically ingesting operational data for analysis and predictively identifying operationally relevant issues. Amazon DevOps Guru automatically identifies the most common behaviors of applications that correspond to operational incidents. In this post, we shed light on some of the ML approaches that power DevOps Guru. DevOps Guru detectors: at the core of Amazon DevOps Guru is a unique approach to identifying meaningful operational incidents. DevOps Guru insights: instead of providing just a list of anomalies that an ensemble of detectors finds, DevOps Guru generates operational insights that aggregate the relevant information needed to investigate and remediate an operational issue.
The #1 Data Science Case Interview Question You Should Be Ready For
       
The #1 Data Science Case Interview Question You Should Be Ready For. Case interviews are a helpful way to evaluate a potential hire's hard and soft skills. The question may seem simple at first, but it contains a lot of the meat of any data science solution: you have to describe what uniquely identifies a row in the data set and how you are encoding categoricals and numericals, and you have to describe what columns will go into the data set as features and why those features should be predictive. Then, run through the exercise of describing your data set and think about all the implications of the decisions you are making along the way.
What You Should Know Before Starting a Career in Data Analytics
       
Data is clean: when you first take classes in data analytics, you're provided with clean data for assignments and don't have to account for outliers or missing or bad data. The company may have no data engineers to help build ETL pipelines to automate loading data into the database. Being in a data analytics role requires you to interact with many people. You only have one manager: I bet no one told you that data analytics is also a customer service job. Data analytics is in high demand and finding a job shouldn't be hard: in fact, it's difficult to find a data analytics job without any experience.
How to get that first bit of data science experience on your resume
       
How to get that first bit of data science experience on your resume. I have done a fair amount of teaching and mentoring for data science programs of various sorts, and one of the biggest frustrations my students quickly run into when they finish is the classic catch-22: to get that first job, you need experience, but the only thing that counts as experience is a job. Internships can be a great way to get that first bit of experience. However, they can be pretty competitive, and not much easier as a first step into the data science world. Moreover, companies are often looking for students either still in a degree program or who have just finished, which means internships can be hard to come by for those entering data science as a later-career move (the majority of my students). Putting together a full end-to-end live project: it's one thing to put together a simple learning project that uses some data off of Kaggle and upload it to GitHub.
Altair: Statistical Visualization Library for Python (Part 4)
       
Customizing visualizations. Altair is a statistical visualization library for Python. Altair is highly flexible in terms of data transformations. We will be using an insurance dataset that is available on Kaggle.
import numpy as np
import pandas as pd
import altair as alt

insurance = pd.read_csv("/content/insurance.csv")
insurance.head()
The dataset contains some measures. Let's also change the size of the visualization using the properties function.
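A short sketch of that sizing step; the column names follow the Kaggle insurance dataset's usual schema, which is an assumption on my part:

import altair as alt
import pandas as pd

insurance = pd.read_csv("/content/insurance.csv")

chart = alt.Chart(insurance).mark_circle(size=50).encode(
    x="age", y="charges", color="smoker"
).properties(width=500, height=300, title="Charges by age")  # resize via properties

chart.save("chart.html")  # or simply `chart` in a notebook to render inline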
Tips for Running High-Fidelity Deep Reinforcement Learning Experiments
       
These phenomena make running high-fidelity, scientifically rigorous reinforcement learning experiments paramount. In this article, I will discuss a few tips and lessons I've learned to mitigate the effects of these difficulties in DRL; tips I never would have learned from a reinforcement learning class. Many reinforcement learning algorithms have some degree of randomness/stochasticity built in, for instance in how your neural networks are initialized (see the seeding sketch below). Well, one way to assess a component is to try running the reinforcement learning algorithm without it, using an ablation study. In this article, we talked about the importance of running high-fidelity, scientifically rigorous experiments in deep reinforcement learning, and some methods through which we can achieve this.
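One standard mitigation is fixing every random seed you can; here is a sketch of the usual boilerplate, with the caveat that which libraries need seeding depends on your stack:

import random
import numpy as np
import torch

def set_seed(seed: int = 42):
    random.seed(seed)                  # Python's own RNG
    np.random.seed(seed)               # NumPy (e.g., replay buffers, sampling)
    torch.manual_seed(seed)            # PyTorch weight initialization
    torch.cuda.manual_seed_all(seed)   # all GPUs, if present

set_seed(42)  # environments typically need their own seed call as well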
Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
       
For example, checking for negation of negative sentiment. Sentence: "The food is not poor." If the model's prediction changes beyond a certain user-defined threshold, the test case is treated as a model failure. They chose Microsoft's, Google's, and Amazon's paid NLP APIs as the commercial models, and BERT and RoBERTa as the research models. Research models like RoBERTa had a decent, if not perfect, performance.
Creating ONNX from scratch
       
In these cases, users often simply save a model to ONNX format without worrying about the resulting ONNX graph. In this tutorial we will show how to use the onnx.helper tools in Python to create an ONNX pipeline from scratch and deploy it efficiently. Creating the ONNX pipeline: when viewed using Netron, our result shows the full data processing pipeline created in ONNX.
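A tiny end-to-end example of the onnx.helper workflow: a one-node graph that adds two vectors (the node and tensor names are arbitrary choices of mine):

import onnx
from onnx import helper, TensorProto

node = helper.make_node("Add", inputs=["a", "b"], outputs=["sum"])
graph = helper.make_graph(
    [node], "tiny-pipeline",
    inputs=[helper.make_tensor_value_info("a", TensorProto.FLOAT, [3]),
            helper.make_tensor_value_info("b", TensorProto.FLOAT, [3])],
    outputs=[helper.make_tensor_value_info("sum", TensorProto.FLOAT, [3])],
)
model = helper.make_model(graph)
onnx.checker.check_model(model)  # validates the graph structure
onnx.save(model, "tiny.onnx")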
iPad Pro + Raspberry Pi for Data Science Part 2: Setting Up the Hardline Connection!
       
Raspberry Pi + iPad. iPad Pro + Raspberry Pi for Data Science Part 2: Setting Up the Hardline Connection! I purposefully wiped my own Raspberry Pi and re-created all the steps here just to make sure you don't run into any issues. If all went well, you should now be able to plug the Raspberry Pi directly into your iPad and have it recognized as an ethernet device in the settings. If you made it this far, you're ready to start using your Raspberry Pi directly with your iPad Pro, and you can now interface with it over a hardline connection!
The most powerful passports in the world in 2021, ranked
       
Those holding the following passports can visit 184 places without a visa in 2021: Australia, Czech Republic, Greece, and Malta. Those holding the following passports can visit 186 places without a visa in 2021: France, Ireland, Netherlands, Portugal, and Sweden. Those holding the following passports can visit 187 places without a visa in 2021: Austria, Denmark. Those holding the following passports can visit 188 places without a visa in 2021: Finland, Italy, Luxembourg, and Spain. Those holding passports from Germany and South Korea can visit 189 places without a visa in 2021.
Stop Kidding Yourself: You Know They’re Not That into You
       
Stop Kidding Yourself: You Know They're Not That into You. Last year, I met a man. I told him I thought he was great, but we should stop seeing each other because I didn't think we were a good match. "Do you think we'll see each other again?" He texted back, "Of course we're seeing each other again." And then he disappeared. But if you get to the point of having to ask someone if they're still interested, well, that's a pretty big sign they're probably not. If that's not what you have, stop kidding yourself; they're just not that into you.
A Few Lessons from Vinod Khosla
       
Reflecting on my last day at Khosla Ventures: after 3.5 years, today is my last day at Khosla Ventures. It would not be possible to summarize all my learnings in one post, but I would first like to say thank you to Vinod Khosla and the entire partnership for taking a chance on me. While it's hard to summarize in a post, the lessons I've learned from Vinod and Khosla Ventures could fill a library. Vinod taught me that you can't do everything, and you have to make the hard choices about what to do. Vinod taught me to always take the risk to do the right thing and be brutally honest.
How hackers are finding creative ways to steal gift cards using artificial intelligence.
       
However, the one that caught my eye was an online store selling Starbucks gift cards. I was genuinely curious how someone could obtain stolen gift cards for different stores. It was fascinating because the seller was selling these gift cards without a PIN attached. I was met with a prompt DM telling me I could check them using the store's gift card phone line. A minuscule amount compared to what you could make stealing people's gift cards, from the perspective of the black hat.
‘We traced a phone inside the Capitol to Mr. Vincent’s home in Kentucky.’
       
In a new piece for The New York Times, writers Charlie Warzel and Stuart A. Thompson detail, and not for the first time, how our smartphones feed a so-called "surveillance economy" that annihilates personal privacy in real and unexpected ways. Warzel and Thompson obtained a file from an unnamed source containing location data tied to "thousands of Trump supporters, rioters, and passers-by in Washington, D.C." on the date of the insurrection at the Capitol. This data was generated by smartphone apps for the sake of digital advertising, and although it is supposed to be anonymized, Warzel and Thompson demonstrate how easily it can be pinned to individuals. For example, they were able to use the data to track pest exterminator Ronnie Vincent's path from Kentucky to the Capitol. "There is no way that my phone shows me in there," he said.
Gradient Descent With Momentum from Scratch
       
Tutorial Overview: this tutorial is divided into three parts; they are: Gradient Descent; Momentum; Gradient Descent With Momentum (One-Dimensional Test Problem, Gradient Descent Optimization, Visualization of Gradient Descent Optimization, Gradient Descent Optimization With Momentum, Visualization of Gradient Descent Optimization With Momentum). Gradient Descent: gradient descent is an optimization algorithm. Momentum: momentum is an extension to the gradient descent optimization algorithm, often referred to as gradient descent with momentum. Gradient Descent Optimization: next, we can apply the gradient descent algorithm to the problem.
# define momentum
momentum = 0.3
# perform the gradient descent search with momentum
best, score = gradient_descent(objective, derivative, bounds, n_iter, step_size, momentum)
Tying this together, the complete example of gradient descent optimization with momentum is listed below.
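Since the post's complete listing is not reproduced here, the following is a self-contained sketch of gradient descent with momentum on a one-dimensional quadratic; the objective, bounds, and hyperparameters mirror the tutorial's style but are my own choices:

import numpy as np

def objective(x):
    return x ** 2.0

def derivative(x):
    return 2.0 * x

def gradient_descent(objective, derivative, bounds, n_iter, step_size, momentum):
    rng = np.random.default_rng(1)
    x = bounds[0] + rng.random() * (bounds[1] - bounds[0])  # random starting point
    change = 0.0
    for i in range(n_iter):
        # new change = step down the gradient plus a fraction of the previous change
        change = step_size * derivative(x) + momentum * change
        x = x - change
    return x, objective(x)

best, score = gradient_descent(objective, derivative, (-1.0, 1.0), 30, 0.1, 0.3)
print(best, score)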
Google AI Blog: Machine Learning for Computer Architecture
       
This requires the evaluation of many different accelerator design points, each of which may not only improve the compute power but also unlock a new capability. However, the manifold of architecture search generally contains many points for which there is no feasible mapping from software to hardware. The figure in the original post shows the overall architecture search space of a target ML accelerator. Optimization Strategies: in this study, we explored four optimization strategies in the context of architecture exploration, including Random, which samples the architecture search space uniformly at random. The star and circular markers show the infeasible (zero-reward) and feasible design points, respectively, with the size of the feasible points corresponding to their reward.
Semantic Segmentation with GANs for Self-Driving Cars
       
Semantic Segmentation with GANs for Self-Driving Cars (Becker, Dan). How are GANs suited to perform semantic segmentation? This makes them suitable for the task of semantic segmentation on street-view images. Constructing the GAN: as I am using the Keras framework to construct the generator and the discriminator, I need to import all the necessary layer types first.

from numpy import expand_dims
from numpy import zeros
from numpy import ones
from numpy import vstack
from numpy.random import randn
from numpy.random import randint
from keras.utils import plot_model
from keras.models import Model
from keras.models import Sequential
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import Dropout
from keras.layers import LeakyReLU
from keras.layers import BatchNormalization
from keras.layers import Activation, Reshape
from keras.layers import Concatenate
from keras.layers.convolutional import Conv2D, Conv2DTranspose
from keras.layers.pooling import MaxPooling2D
from keras.layers.merge import concatenate
from keras.initializers import RandomNormal
from keras.optimizers import Adam
from IPython.display import clear_output

def define_discriminator(image_shape=(256, 256, 3)):
    init = RandomNormal(stddev=0.02)
    in_src_image = Input(shape=image_shape)
    in_target_image = Input(shape=image_shape)
    # concatenate the source and target images channel-wise
    merged = Concatenate()([in_src_image, in_target_image])
    d = Conv2D(64, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(merged)
    d = LeakyReLU(alpha=0.2)(d)
    d = Conv2D(128, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(d)
    d = BatchNormalization()(d)
    d = LeakyReLU(alpha=0.2)(d)
    d = Conv2D(256, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(d)
    d = BatchNormalization()(d)
    d = LeakyReLU(alpha=0.2)(d)
    d = Conv2D(512, (4,4), strides=(2,2), padding='same', kernel_initializer=init)(d)
    d = BatchNormalization()(d)
    d = LeakyReLU(alpha=0.2)(d)
    d = Conv2D(512, (4,4), padding='same', kernel_initializer=init)(d)
    d = BatchNormalization()(d)
    d = LeakyReLU(alpha=0.2)(d)
    # the original snippet was truncated here; the lines below complete it with
    # the standard pix2pix PatchGAN ending (an assumption, not the author's code)
    d = Conv2D(1, (4,4), padding='same', kernel_initializer=init)(d)
    patch_out = Activation('sigmoid')(d)
    return Model([in_src_image, in_target_image], patch_out)
Weighted Linear Regression
       
Linear Regression Model
We start with the linear regression mathematical model.
Weighted Linear Regression
Weighted linear regression is a generalization of linear regression in which the covariance matrix of the errors is incorporated into the model. One approach is provided here:
1. Solve linear regression without the covariance matrix (equivalently, solve weighted linear regression with C = I, which is the same as linear regression).
2. Calculate the residuals.
3. Estimate the covariance from the residuals.
4. Solve weighted linear regression using the estimated covariance.
Python Example
In this section, we provide a Python code snippet to run weighted linear regression on heteroscedastic data and compare it with linear regression. In this code, we generate a set of synthetic data where the variance of the observation error is a function of the feature; then we run weighted linear regression and find the coefficients. [Figure: response variable vs. feature variable (image by author)] The chart shows that in the presence of heteroscedasticity, weighted linear regression provides a more accurate estimate of the regression coefficients. Weighted linear regression should be used when the observation errors do not have constant variance and violate the homoscedasticity requirement of linear regression.
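The snippet itself did not survive this excerpt; below is a minimal NumPy sketch of the four-step procedure, with an illustrative data-generating process and a simple residual-based variance estimate (both are assumptions, not necessarily the author's code).

import numpy as np

rng = np.random.default_rng(0)

# synthetic heteroscedastic data: noise standard deviation grows with the feature
n = 200
x = rng.uniform(1, 10, n)
sigma = 0.5 * x
y = 2.0 + 3.0 * x + rng.normal(0, sigma)

X = np.column_stack([np.ones(n), x])

# step 1: ordinary least squares (equivalent to C = I)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# steps 2-3: residuals, then a crude variance model
# obtained by regressing |residual| on the feature
resid = y - X @ beta_ols
gamma, *_ = np.linalg.lstsq(X, np.abs(resid), rcond=None)
sigma_hat = np.clip(X @ gamma, 1e-6, None)

# step 4: weighted least squares with W = diag(1 / sigma_hat^2),
# i.e. solve (X^T W X) beta = X^T W y
W = 1.0 / sigma_hat**2
XtW = X.T * W
beta_wls = np.linalg.solve(XtW @ X, XtW @ y)

print("OLS:", beta_ols, " WLS:", beta_wls)

On data like this, the WLS coefficients tend to sit closer to the true (2.0, 3.0) because high-variance observations are down-weighted.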
An empirical approach to speedup your BERT inference with ONNX/Torchscript
       
Inference takes a relatively long time compared to more modest models, and it may be too slow to achieve the throughput you need. First, we'll take a quick look at how to export a PyTorch model to the relevant format/framework; if you don't want to read code, you can skip to the results section further down. If you want to run inference on both CPU and GPU, you need to save two different models. Inference time scales up roughly linearly with sequence length for larger batches, but not for individual samples.
Next Steps
While these experiments have been run directly in Python, both TorchScript and ONNX models can be loaded directly in C++, which could provide an additional boost in inference speed.
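To make the export step concrete, here is a rough sketch of tracing a Hugging Face BERT model to TorchScript and exporting it to ONNX; the model name, sequence length, and opset version are assumptions, not necessarily the author's settings.

import torch
from transformers import BertForSequenceClassification, BertTokenizer

model_name = "bert-base-uncased"
tokenizer = BertTokenizer.from_pretrained(model_name)
# torchscript=True makes the model return tuples, which tracing requires
model = BertForSequenceClassification.from_pretrained(model_name, torchscript=True)
model.eval()

dummy = tokenizer("a sample sentence", return_tensors="pt",
                  padding="max_length", max_length=128, truncation=True)

# TorchScript export via tracing on a fixed-shape example input
traced = torch.jit.trace(model, (dummy["input_ids"], dummy["attention_mask"]))
torch.jit.save(traced, "bert_traced.pt")

# ONNX export; dynamic axes keep batch size and sequence length flexible
torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"]),
    "bert.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch", 1: "seq"},
                  "attention_mask": {0: "batch", 1: "seq"},
                  "logits": {0: "batch"}},
    opset_version=12,
)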
Finding Hard Samples in Your Image Classification Dataset
       
Finding Hard Samples in Your Image Classification Dataset
Photo by Tim Foster on Unsplash
Say you have a repository with millions of unlabeled images in it. In an image classification dataset, a hard sample could be anything from a cat that looks like a dog to a blurry, low-resolution image. If you expect your model to perform well on these hard samples, then you may need to "mine" more examples of them to add to your training dataset. Exposing your model to more hard samples during training will allow it to perform better on those types of samples later on. Hard samples are useful for more than just training data; they are also necessary to include in your test set.
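One simple mining heuristic, sketched below, is to rank unlabeled images by the model's confidence and surface the least confident ones for labeling; the predict_proba interface here is a hypothetical stand-in, not a specific library's API.

import numpy as np

def mine_hard_samples(model, images, k=100):
    # assumed interface: probabilities of shape (n_images, n_classes)
    probs = model.predict_proba(images)
    # low top-class probability suggests the model finds the image hard
    confidence = probs.max(axis=1)
    # indices of the k least confident images, hardest first
    return np.argsort(confidence)[:k]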
Data Scientists Aren’t Going Away Anytime Soon
       
Data Scientists Aren’t Going Away Anytime Soon
I spent the last week reading about GPT-x models and how revolutionary they are going to be. The articles had piqued my interest, and I decided to train a model on my humble machine. This tells us about the proclivities of humans, and how the same applies indirectly to programmers as well. You tell the machine what you need to accomplish, and the machine does it for you. Automation has been "around the corner" for a while now, and so far it has tried, unsuccessfully, to make a space for itself.
Essential Pandas Every Data Scientist Should Know in 2021
       
Pandas is a library that every data scientist stumbles upon at some point in their day.

df = pd.DataFrame(np.random.rand(3, 4), columns=['A', 'B', 'C', 'D'])

          A         B         C         D
0  0.187429  0.565957  0.611477  0.747578
1  0.359590  0.548627  0.895516  0.381595
2  0.229124  0.702323  0.878924  0.483650

~, & and | correspond to logical NOT, AND, and OR for Boolean indexing. Each comparison must be parenthesized before combining, otherwise operator precedence raises an error:

df[~(df > 0.3)]

          A   B   C   D
0  0.187429 NaN NaN NaN
1       NaN NaN NaN NaN
2  0.229124 NaN NaN NaN

df[df > 0.3 & df < 0.5]        # Error!
df[(df > 0.3) & (df < 0.5)]

          A   B   C         D
0       NaN NaN NaN       NaN
1  0.359590 NaN NaN  0.381595
2       NaN NaN NaN  0.483650

df[df < 0.3 | df < 0.5]        # Error!
df[(df < 0.3) | (df < 0.5)]

          A   B   C         D
0  0.187429 NaN NaN       NaN
1  0.359590 NaN NaN  0.381595
2  0.229124 NaN NaN  0.483650

The fillna() function is used to replace NaN values in DataFrames:

df[~(df > 0.3)].fillna(0)

          A    B    C    D
0  0.187429  0.0  0.0  0.0
1  0.000000  0.0  0.0  0.0
2  0.229124  0.0  0.0  0.0

df[~(df > 0.3)].fillna(0, limit=1)      # fill at most one NaN per column

          A    B    C    D
0  0.187429  0.0  0.0  0.0
1  0.000000  NaN  NaN  NaN
2  0.229124  NaN  NaN  NaN

df[~(df > 0.3)].fillna(method='ffill')  # method='pad' gives the same result

          A   B   C   D
0  0.187429 NaN NaN NaN
1  0.187429 NaN NaN NaN
2  0.229124 NaN NaN NaN

df[~(df > 0.3)].fillna(method='bfill')  # method='backfill' gives the same result

          A   B   C   D
0  0.187429 NaN NaN NaN
1  0.229124 NaN NaN NaN
2  0.229124 NaN NaN NaN

df[~(df > 0.3)].fillna(method='ffill', axis=1)

          A         B         C         D
0  0.187429  0.187429  0.187429  0.187429
1       NaN       NaN       NaN       NaN
2  0.229124  0.229124  0.229124  0.229124
Do We Even Need DataFrames?
       
Often when we work with data in data science, our data can be as abstract as a filename. So with that in mind, there really isn't a reason that we would necessarily need to use a data frame in the same way that we use a matrix. That being said, while data frames do offer this flexibility, they are also a dependency and can be difficult to work with compared to traditional data types. This is less true of the Pandas library for Python, in my opinion, as I think Pandas has handled the problem of data frames rather well. We also need to consider that high-dimensional data is going to be a lot more difficult to deal with when working with key-value pairs.
Delve Into Number-Crunching with Temporal Convolutional Network
       
A Temporal Convolutional Network (TCN) is a neural network that specializes in tackling sequence-data problems. It uses a Fully Convolutional Network (FCN) for processing. What's the difference between an FCN and an ordinary convolutional layer? To understand the convolutional layer of a neural network, visualization is very important. That is why the first value in the output layer is the most suitable to become the output of our model.
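A minimal Keras sketch of the idea follows, with illustrative hyperparameters: causal padding keeps each output step from seeing the future, and growing dilation rates widen the receptive field, which is the core of a TCN.

from tensorflow.keras import layers, models

def build_tcn(seq_len, n_features, filters=32, kernel_size=3, dilations=(1, 2, 4, 8)):
    # causal, dilated 1-D convolutions: each step sees only current and past inputs
    inp = layers.Input(shape=(seq_len, n_features))
    x = inp
    for d in dilations:
        x = layers.Conv1D(filters, kernel_size, padding='causal',
                          dilation_rate=d, activation='relu')(x)
    # summarize the sequence with the step that has seen the longest history
    # (one common convention for sequence-to-one prediction)
    last = layers.Lambda(lambda t: t[:, -1, :])(x)
    out = layers.Dense(1)(last)
    return models.Model(inp, out)

model = build_tcn(seq_len=64, n_features=1)
model.compile(optimizer='adam', loss='mse')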
Object Detection with Keras and Determined
       
In this tutorial, you’ll start by installing Determined on AWS, and then modify an off-the-shelf tf.keras object detection model to work with Determined. The model being used is based on the Object Detection with RetinaNet using Keras tutorial.
Adapting RetinaNet To Work With Determined
Porting a model to Determined involves making the model compatible with the Determined API. Determined provides an easy-to-use hyperparameter search interface that automatically applies its built-in search algorithms and tracks and visualizes your experiments. There are a variety of search algorithms provided by Determined, and you can use any of them to perform your hyperparameter search.
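In outline, the port looks roughly like the sketch below, which follows Determined's TFKerasTrial interface; the helper functions and hyperparameter names are hypothetical stand-ins for the RetinaNet specifics (and depending on the Determined version, additional context wrapping may be required), so treat this as a shape, not a working port.

import tensorflow as tf
from determined.keras import TFKerasTrial

class RetinaNetTrial(TFKerasTrial):
    def __init__(self, context):
        self.context = context

    def build_model(self):
        # build_retinanet() is a hypothetical helper returning a tf.keras model
        model = build_retinanet()
        # hyperparameters come from the experiment configuration file
        lr = self.context.get_hparam('learning_rate')
        model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                      loss=retinanet_loss)  # hypothetical loss function
        return model

    def build_training_data_loader(self):
        return load_detection_split('train')       # hypothetical data loader

    def build_validation_data_loader(self):
        return load_detection_split('validation')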
Adversarial Eyeglasses to Trick Facial Recognition
       
The post builds on a paper that describes a general framework for adversarial example generation; the authors utilize eyeglass frames affixed to people's faces to trick a facial recognition classifier.
[Figure: algorithm from the paper (left) and my "translation" (right); image by author]
Finetuning the Facial Recognition Classifier
I utilized a pre-trained facial recognition model trained on VGGFace2 and then finetuned it to recognize my face. Check out my other article on finetuning a facial recognition classifier for more info. The generator should continue to output appropriate eyeglass frames; however, we also want those eyeglass frames to trick the facial recognition classifier. Now we should have a generator that outputs eyeglass frames that consistently trick the facial recognition classifier.
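The core optimization can be sketched in PyTorch as below, a simplified untargeted variant rather than the paper's exact algorithm, with model, face, mask, and true_label as assumed inputs: only pixels under the eyeglass-frame mask are perturbed, and the perturbation is pushed to raise the classifier's loss on the true identity.

import torch
import torch.nn.functional as F

def eyeglass_attack(model, face, mask, true_label, steps=100, lr=0.05):
    # delta holds the learnable perturbation; mask zeroes it outside the frames
    delta = torch.zeros_like(face, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        adv = (face + delta * mask).clamp(0, 1)   # perturb frame pixels only
        logits = model(adv)
        # maximize loss on the true identity (untargeted attack)
        loss = -F.cross_entropy(logits, true_label)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (face + delta.detach() * mask).clamp(0, 1)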
Geometric Models for Anomaly Detection in Machine Learning
       
Geometric Models for Anomaly Detection in Machine Learning
Image by Author
Anomaly detection is the identification of rare items, events, or observations that raise suspicion by differing significantly from the majority of the data [1]. This blog reflects my learnings from Week 2 of Intel's Anomaly Detection course, which I have been taking lately. The variance in the magnitude of the angles a point encloses with pairs of other points comes out differently for outliers than for normal points, and this becomes the metric for clustering normal and outlier points separately. Read "An Awesome Tutorial to Learn Outlier Detection in Python using the PyOD Library", a comprehensive guide to getting started with the library. It provides access to around 20 outlier detection algorithms.
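As a quick way to try the angle-based idea, here is a minimal PyOD sketch on synthetic 2-D data (the data and contamination value are illustrative); PyOD's ABOD detector scores points by the variance of their enclosed angles, the geometric criterion described above.

import numpy as np
from pyod.models.abod import ABOD

rng = np.random.default_rng(42)
# a tight inlier cluster plus a few scattered outliers
X_inliers = rng.normal(0, 1, size=(200, 2))
X_outliers = rng.uniform(-6, 6, size=(10, 2))
X = np.vstack([X_inliers, X_outliers])

clf = ABOD(contamination=0.05)
clf.fit(X)

labels = clf.labels_            # 0 = inlier, 1 = outlier
scores = clf.decision_scores_   # higher means more anomalous
print(f"Flagged {labels.sum()} points as outliers")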