Hacking in the World of Artificial Intelligence

Machine learning today is a minefield of security issues, and they need to be dealt with fast.

A look at hacking in the world of artificial intelligence, and how to trick machine learning algorithms.




One thing we haven't talked about much is the role of human interference in the dangers of artificial intelligence, especially in the early stage the technology is in right now.
This is far less of a "doomsday" or apocalyptic scenario, but the consequences can still be devastating for individual people, or even for larger groups, depending on where a machine learning algorithm is deployed.



We want to look at very specific cases where a machine learning system keeps learning from feedback submitted by the public, much like how you can tell Google Translate that a translation is incorrect and submit your own improvement.
Another good example is marking an email as spam when it landed in your inbox but clearly belongs in the spam folder, so that the machine learning algorithm gets better over time.


These technologies can be exploited in a variety of ways. I initially thought it would take a large group of people to skew what a machine learns by flooding it with badly labeled training examples, but a single botnet could plausibly do the same job.

See, most machine learning algorithms are trained on a so-called "labeled" data set: a large set of input data in which every input comes with a label, the desired output for that input.
The training data goes through the network, and the output of the network is then compared to the labeled output.

If the output of the network differs from the label, the weights of the network are adjusted by an optimization algorithm, and the next training step is performed.
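As a minimal sketch of that training loop, here is a single-neuron perceptron in plain Python. The perceptron, the toy data set (logical AND), and the names `predict` and `train` are all assumptions chosen for illustration, not something taken from a real system.

```python
# Illustrative sketch only: a one-neuron perceptron trained on a toy labeled data set.

def predict(weights, bias, features):
    """Return 1 if the weighted sum of the inputs is positive, otherwise 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0

def train(examples, epochs=20, learning_rate=1):
    """Adjust the weights whenever the output differs from the label."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for features, label in examples:
            # error is 0 when the output already matches the label
            error = label - predict(weights, bias, features)
            if error != 0:
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
    return weights, bias

# Labeled data set: each input is paired with its desired output (logical AND).
labeled_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(labeled_data)
predictions = [predict(weights, bias, features) for features, _ in labeled_data]
print(predictions)  # [0, 0, 0, 1]
```

The integer learning rate is only there to keep the arithmetic exact in this toy case; real networks use small fractional rates and far more elaborate optimizers.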

This is what I mean by injecting badly labeled data. If there is some way to feed the machine learning algorithm new training data that you have deliberately labeled the wrong way, it will still try to learn that data perfectly, and the result will be false predictions.
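To make the effect concrete, here is a small sketch of label poisoning against a toy spam filter. Everything in it is hypothetical: the single "spamminess score" feature, the nearest-centroid classifier, and the numbers were picked only so the shift is easy to see.

```python
# Illustrative sketch only: deliberately mislabeled examples shift a toy classifier.

def train_centroids(examples):
    """Learn one centroid (the mean feature value) per label."""
    totals, counts = {}, {}
    for value, label in examples:
        totals[label] = totals.get(label, 0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}

def classify(centroids, value):
    """Pick the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Honest training data: (spamminess score, label).
clean = [(0, "ham"), (1, "ham"), (1, "ham"), (2, "ham"),
         (8, "spam"), (9, "spam"), (9, "spam"), (10, "spam")]

# The attacker submits clearly spammy messages labeled as legitimate mail.
poison = [(9, "ham")] * 12

clean_model = train_centroids(clean)
poisoned_model = train_centroids(clean + poison)

clean_verdict = classify(clean_model, 7)
poisoned_verdict = classify(poisoned_model, 7)
print(clean_verdict, poisoned_verdict)  # spam ham
```

A dozen bad submissions are enough here because the honest data set is tiny; against a production system an attacker would need volume, which is exactly where a botnet comes in.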

We have actually seen a very real-world example of this when Microsoft deployed its Twitter bot called Tay.

Tay was a chatbot deployed to Twitter with the idea that it would learn from the conversations it had with the reputable people on the social platform.
Unfortunately, in less than 24 hours, people had manipulated the bot into spouting mostly racist hate speech.

What we can glean from this experiment is that we should never underestimate how easily the Internet joins forces to manipulate any system that is fed by public data.
The same can be seen whenever some company launches an innocent Internet poll to determine the name of a new research vessel, though Boaty McBoatface was innocent enough.

There is no predicting how the masses will react in the end.

All this is just a proof of concept, though. When we talk about exploiting machine learning algorithms, we do not mean the work of the masses, but of a small group, or an individual with a large botnet, pursuing a self-serving goal, whether the reward is financial gain, status, or political in nature.

Machine learning models are still incredibly volatile structures, and messing with the training data can result in very inaccurate or even skewed outputs. That isn't so bad in the case of a simple Twitter bot experiment, but if the data collected is of a more sensitive nature, things can get out of hand very quickly.

I ran into this myself not so long ago while I was using Twitter to build up a corpus for training Word2Vec.
The idea was to plug into the Twitter stream and follow the topic "refugees," as I was looking to build a dictionary of words related to the word "refugee."
This has to do with a project I will be launching soon, so more about that in a future post.

In any case, this was right after one of the recent attacks in London. Because the attacker's parents were refugees, which was something everybody had an opinion on, my results were soon skewed entirely toward "terrorist." That not only made no sense in the context I was looking for, it also messed up the output of my model, so I had to find another way to build up the corpus.
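The mechanism is easy to reproduce on a small scale. The sketch below does not use Word2Vec; it swaps in plain co-occurrence counting as a crude stand-in for "related words," and the two tiny corpora are invented for illustration rather than taken from my actual Twitter data.

```python
# Illustrative sketch only: a burst of topical posts takes over the "related words".
from collections import Counter

def related_words(sentences, target, top_n=3):
    """Rank the words that most often share a sentence with the target word."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        if target in words:
            counts.update(word for word in words if word != target)
    return [word for word, _ in counts.most_common(top_n)]

# Hypothetical corpus collected on a quiet day.
baseline = [
    "refugee shelter aid",
    "refugee aid camp",
    "refugee aid border",
]

# Hypothetical burst of posts right after a news event.
burst = ["refugee terrorist london"] * 4 + ["refugee terrorist attack"] * 2

before = related_words(baseline, "refugee", top_n=1)
after = related_words(baseline + burst, "refugee")
print(before)  # ['aid']
print(after)   # ['terrorist', 'london', 'aid']
```

Nobody has to be attacking anything for this to happen: a corpus built from a live stream simply inherits whatever the stream is shouting about that day.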


Of course, it is unlikely that malicious individuals could easily exploit systems deployed by a company as large, well-staffed, and deeply versed in the core technology as Google.
Yet, in a world where more and more companies, big or small, are deploying their own A.I. strategies, we have to assume that the tech will be misunderstood by some, and mistakes will be made.

Artificial intelligence is hot right now, all the rage, and for sure the new buzz-word. If you have ever worked in tech in any capacity, you know what that can do to the people managing a company.

This is no different from companies throwing up a cheap web application firewall and thinking their security is taken care of, or from people relying on phone-based 2-step authentication for their social media accounts, which left them vulnerable to social engineering attacks on their cellphone provider's help desk.

This topic deserves as much attention as all the other negative scenarios around A.I. I would even submit that it is more important at this early stage: the singularity and similar buzz-words are still far ahead of us, while machine learning is already present all around us.

Follow me on social media to get updates and discuss A.I. I have also recently created a Slack team to see if people are interested in forming a little A.I. community:

Twitter  : @ApeMachineGames
Slack    : join the team
facebook : person / facebook group




Applying Zipf's Law To Adaptive Text Compression

George Kingsley Zipf's law states that, given a large sample of words, the frequency of any word is inversely proportional to its rank in the frequency table. In this article we look at applying that idea to adaptive text compression, which, while a somewhat naive approach, is actually a lot of fun and a great learning experience.


China's Plan For A.I. Superiority

So China made the news recently by announcing its plan to be the world leader in artificial intelligence technology by the year 2030, and I for one am not surprised at all by this news.

Think about it for a second, besides those freaky machines made at the Boston Dynamics labs which recently were in the spotlight—mostly because of the footage of them being kicked around by their creators, while subsequently failing to perform the challenges set for them outside of the lab by DARPA—what other country do you know to be often in the limelight when it comes to advances in robotics and machine learning?

Furthermore, I suspect any country that has money and human resources to spend on this will have the same goal as China and America, which is to become the world leader in the most advanced technologies possible, and this has been going on since the dawn of man.


5 Simple Ways To Optimize Your Chatbot (HowTo)

If you have ever messed around with chatbot services and got beyond the "order a pizza" example, you might be wondering: Why does my chatbot start messing up when I train it more?

We all know the scenario, starting with a simple bot, and adding the small-talk intent to it that all of them provide out of the box. Things are looking great so far!
Adding more intents and features to our bot is working out too, and we are soon dreaming of building a general solution to all of our customer service related problems.

What's more, many companies are sprouting up targeting businesses big and small, promising them solutions for customer service, employee onboarding, and other tasks that still require some human finesse.

Sadly, things are not that simple...