First and foremost, what is machine learning, and why is it a good thing? Machine learning is a set of statistical/mathematical tools and algorithms for training a computer to perform a specific task, for example, recognizing faces.
Two important words here are “training” and “statistical.” Training because you are literally teaching the computer about a particular task. Statistical because the computer is working with probabilistic math, and the chances of it getting the answer “correct” varies with the type and complexity of the question that it’s being trained to answer.
There are a number of different types of machine learning algorithms, from the simple “Naïve Bayes” to “Neural Networks” to “Maximum Entropy” and “Decision Trees.” We’re more than happy to geek on out with you with respect to advantages and disadvantages of different types, and talk about linear vs. non-linear learning, feed-forward systems, or argue about multi-layer hidden networks vs. explicitly exposing each layer.
ClickEdge is a solutions-based company. We maintain dozens of both supervised and unsupervised machine learning models. (Close to 20, actually.) We have dozens of person-years dedicated to gathering data sets, experimenting with the state-of-the-art machine learning algorithms, and producing models that balance accuracy, broad applicability, and speed.
ClickEdge is not a general-purpose machine learning company. We are not providing you with generic algorithms that can be tuned for any machine-learning problem. We are entirely, completely, and totally focused on text. All of our machine learning algorithms, models, and techniques are optimized to help you understand the meaning of text content.
Text content requires special approaches from a machine learning perspective, in that it can have hundreds of thousands of potential dimensions to it (words, phrases, etc), but tends to be very sparse in nature (say you’ve got 100,000 words in common use in the English language, in any given tweet you’re only going to get say 10-12 of them). This differs from something like video content where you have very high dimensionality, but you have oodles and oodles of data to work with, so, it’s not quite as sparse.
Why is this an issue? Because how can you start grouping things together and seeing trends unless you can understand the similarities between content.
In order to deal with the specific complications of text, we use what’s called a “hybrid” approach. Meaning, that unlike pure-play machine learning companies, we use a combination of machine learning, lists, pattern files, dictionaries, and natural language algorithms. In other words, rather than just having a variety of hammers (different machine learning algorithms), we have a nice tool belt full of different sorts of tools, each tool optimal for the task at hand.
The “term du jour” seems to be “deep learning” – which is an excellent rebranding of “neural networks.” Basically, the way that deep learning works is that there are several layers that build up on top of each other in order to recognize a whole. For example, if dealing with a picture, layer 1 would see a bunch of dots, layer 2 would recognize a line, layer 3 would recognize corners connecting the lines, and the top layer would recognize that this is a square.
This explanation is an abstraction of what happens inside of deep learning for text – the internal layers are opaque math. We have taken a different approach that we believe to be superior to neural networks/deep learning – explicitly layered extraction. We have a multi-layered process for preparing the text that helps reduce the sparseness and dimensionality of the content – but as opposed to the hidden layers in a deep learning model, our layers are explicit and transparent. You can get access to every one of them and understand exactly what is happening at each step.
To give an idea of the machine learning models we have, just to process a document in English, we have the following machine-learning models:
All of those models help us deal with that dimensionality/sparseness problem listed above. Now, we have to actually extract stuff, so, we’ve got additional models for
For other languages, like Mandarin Chinese, we have to actually figure out what a word is, so, we need to “tokenize” – which is another machine learning task.
Some of our customers, particularly in the market analytics space and the customer experience management space have been hand-coding categories of content for years. This means that they have a lot of content that is bucketed into different categories. Which means that they have a really great set of content for training a machine-learning based classifier – we can do that for you too!
Fully satisfied with the services that has been rendered to us besides so many of odds arose out of the COVID19 Pandemic.
Very professional and very specific driving the value for money, good Mac, I like your delivery style. Result oriented.
Clickedge was a discovery for us for our digital marketing , branding and website needs. Some of the best professionals at one place. The director Mr.Sambaran ensures to solve and help you every step with marketing and promotions. Wishing them all the best and wish that they keep growing.
It’s been a great association with ClickEdge for our brand promotion. We are very happy with the kind of dedication they show towards our Digital marketing campaigns month on month.
We wish them all the best and will highly recommend them to one who is looking for digital marketing, brand promotion, website designs, etc.