Since the rise of computers, the Internet, and related technology in the Third Industrial Revolution, there has been an explosion in the amount of data generated. The term ‘data’ is itself a buzzword. It means information in a raw or unorganized form (such as letters, numbers, or symbols) that refers to, or represents, conditions, ideas, or objects. This is the only revolution taking place at a time when Africa is also capable of joining the race. But we are still in the dark. And as has happened time and again, Africa is lagging by at least half a decade or so. I wish to shed some light on this area, for Africa must rise quickly.
Disclaimer: Data is not only numbers!
In fact, I would estimate that 90% of the data about any particular object has nothing to do with numbers. Data is descriptive in nature. Anything that describes an object is data about that particular object.
Take a tree, for example. The name and genus of that tree (say, an acacia of the genus Vachellia) are data about it, just as much as its height, age, girth, type of roots, number and color of leaves, and the environmental conditions for its growth (humidity, temperature, acidity). People often mistake data for numbers alone: some integers and floating-point decimals that take the life out of the object in question. Yuck! Spare the object some life, for Pete’s sake! Am I an alien, or does it also occur to you that saying “5 feet 3, 23 years” tells you almost nothing about your object of study? At least adding descriptive language, say “blonde, Catholic, plump, she”, gives life to your moody girlfriend, doesn’t it? That is the real data. Forget about statistics; it is part of data, but data is, in itself, everything about anything. Or should I say data is a tool for creating a human being? Come to think of it, the absence of data about a particular object renders it non-existent, for there is nothing left for the philosophers to debate.
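To make the point concrete, here is a minimal sketch of the tree as a single record that mixes descriptive and numeric fields. The field names and every value are hypothetical, chosen only for illustration; they are not measurements of a real tree.

```python
from dataclasses import dataclass, asdict


@dataclass
class TreeRecord:
    # Descriptive (non-numeric) data about the tree.
    common_name: str
    genus: str
    root_type: str
    leaf_color: str
    # Numeric data about the same tree.
    height_m: float
    age_years: int
    girth_cm: float


def split_by_kind(record):
    """Separate a record's text fields from its numeric fields."""
    descriptive, numeric = {}, {}
    for field, value in asdict(record).items():
        if isinstance(value, (int, float)):
            numeric[field] = value
        else:
            descriptive[field] = value
    return descriptive, numeric


# Hypothetical example values.
tree = TreeRecord("acacia", "Vachellia", "taproot", "green", 12.0, 30, 150.0)
descriptive, numeric = split_by_kind(tree)
print("descriptive:", descriptive)
print("numeric:", numeric)
```

Both dictionaries describe the same tree; dropping the descriptive half would leave three numbers with no object attached to them.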
Data becomes Big, or should I say ‘Big Data’
As the section above makes clear, data is limitless and present everywhere in the universe. The act of collecting, storing, and analyzing data has been around for a very long time, since the days of Mesopotamia. What has changed, however, is the size and complexity of the data itself. Data is now so large, fast, or complex that it is difficult or impossible to process using traditional methods. This alone is the definition of big data.
The three V's of big data:
- Volume: Data comes from a myriad of sources, e.g. business transactions, industrial equipment, smart IoT devices, videos and social media, phone calls, and SMS. What is beautiful about volume is that it’s set to explode even further, largely because of the organization being brought to data, especially on the Web. We call it Web 3.0, aka the Semantic Web. A notable example of this organization is the Web of Things (WoT).
- Velocity: There is always data streaming into an organization, whether or not sales were made. Look into the events of the day: say there was a students’ strike that prompted businesses to close, curtailing sales. In there (the causes and effects), you will find priceless data that will prepare you for the future. With the growth of IoT, the speed at which data streams in will be unprecedented, and it must be handled in a timely manner.
- Variety: Data comes in all types of formats, from structured, numeric data in traditional databases to unstructured text documents, emails, video, audio, stock ticker data, financial transactions, and fuzzy events that spew data if investigated.
Why care about the details?
If your stock turnover ratio is good, and indeed you are making worthy profits and expanding your business, why should you care about the minor details anyway? I don’t have to stress this any further: data yields information, which yields knowledge, which yields wisdom. This and this alone is the element of control. A business must always have the following facts in hand:
- What is my current net worth (assets minus liabilities), and what is its history? Have a graphical visualization of the same.
- What forces influence my business performance (social, political, economic), say schools closing, elections, or fuel prices? How and why did they influence my business in this financial period? As Sun Tzu put it, “Foreknowledge cannot be gotten from ghosts and spirits, cannot be had by analogy, cannot be found out by calculation. It must be obtained from people, people who know the conditions of the enemy.” You must study the forces in play.
- What are the projections of these forces in the near future? For example, say the city council expects to relocate the main bus stage in the next 3 years: how will this affect my business? Should I also expect to relocate?
- Based on my performance trends, what projections can I make? What figures should I set as the next goal?
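The first and last questions in the list above can be sketched in a few lines of code. This is only an illustration: the figures are made up, and the projection simply adds the average change per period to the latest value, which is my own simplifying assumption rather than a recommended forecasting method.

```python
def net_worth(assets, liabilities):
    """Net worth as defined above: assets minus liabilities."""
    return assets - liabilities


def net_worth_history(periods):
    """Turn (label, assets, liabilities) rows into (label, net worth) rows."""
    return [(label, net_worth(a, l)) for label, a, l in periods]


def naive_projection(history):
    """Project the next period by adding the average change per period.

    Assumption: the trend continues unchanged, which real forces
    (elections, fuel prices, relocations) can easily break.
    """
    values = [value for _, value in history]
    if len(values) < 2:
        raise ValueError("need at least two periods to see a trend")
    average_change = (values[-1] - values[0]) / (len(values) - 1)
    return values[-1] + average_change


# Hypothetical figures, for illustration only.
periods = [
    ("2021", 100_000, 60_000),
    ("2022", 120_000, 65_000),
    ("2023", 150_000, 70_000),
]
history = net_worth_history(periods)
projection = naive_projection(history)
print(history)
print("next-period goal to beat:", projection)
```

The history list is what you would plot for the graphical visualization; the projection is only a starting figure to set the next goal against.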
I call these the Worthy Forces that Project your Goals. Take note: none of this can be handled without data. You need data. Any reputable business should have a Business Intelligence (BI) department. Let them analyze the data and give back insights and foreknowledge. Many big corporations will kill for data. In fact, Internet giants such as Google, Facebook, and Twitter are built entirely on data. These companies collect massive profits from the intelligence gathered from the data you conveniently give to them. They don’t even have to sell the intel; other corporations are ready and willing to buy the raw data. They will produce their own intelligence and use it however they see fit.
What is important
It doesn’t matter how massive your datasets are. What counts is what you do with your data. Collecting data and analyzing it enables you to find answers to the big questions in your business, in areas like:
- Cost reduction
- Time management
- New product development and optimized offerings
- Smart decision making
The Big Data Landscape: a look into Africa.
To identify the potential of big data, one must first raise awareness of it. After that, resources should be made available for the masses to learn about the discipline. Once equipped with the skills, one is able to handle data imaginatively and creatively provide solutions never thought of before. In essence, education gives one the power to create. Over the past decade, virtually every university in Europe and North America has responded to the challenges and opportunities of data science by establishing new institutes, departments, and degree programs in the field. Africa, on the other hand, is plagued by the same problem over and over again: time lag.
Only recently have we seen some African institutions and organizations respond by introducing the discipline to the masses. Notably, some are creating structures, networks, and training programs to stimulate research and capacity development in the subject area. Examples include the African Center of Excellence in Data Science in Rwanda, the AI & Data Science Research Group at Makerere University in Uganda, Data Science Africa, and the Deep Learning Indaba. We are trying, but Africa has 54 independent countries with multiple institutions of higher learning, and these examples are only a handful. Is it that these institutions aren’t mindful of the continent’s specific needs and realities? It really boggles my mind that some African universities are busy scrapping already established courses. Were these courses established without any particular goal in mind? Was it just for money, or were they prompted by a need to fulfill one of the continent’s goals? Who accredited the courses to begin with?
Also note that a large percentage of the African population hasn’t attended university. The few who have were not equipped with the appropriate skills. Even though data science is a blend of statistics, computer science, mathematics, and engineering, it also requires subject-matter knowledge. If we can’t take our youth to the university, then why don’t we take the universities to them? Data science is a subject best learned in a practical, real-world environment rather than the typical classroom setup. We need MOOCs by Africa for Africa. Notable examples are Data Science Nigeria and Data Science Africa. Filled with videos, webinars, quizzes, tutorials, assignments, exams, practical labs, and research work, and fuelled by connectivity to the sources of the data itself, these MOOCs are our only hope of ever catching up with the rest of the world.
To begin with, data must be mined from wherever it resides, e.g. social media, a farm, or your daily business interactions. Tools already exist for mining data from whichever platform one wants. It’s just so unfortunate that virtually none of these tools are made by Africans. Even web scraping tools, which are just frameworks built with popular programming languages, have their origins outside Africa. We have software engineering, computer science, computer engineering, and other IT courses in our universities. However, they are taught in such a shallow manner, by shallow-thinking professors, that they leave the student with no power to create anything out of them. Should we rebrand them ‘crash courses’?
Secondly, after data is mined, it has to be stored. We have heard of storage tools like Hadoop, MongoDB, CouchDB, etc. Some of them come as stand-alone services while others also come with cloud versions. Where are the storage engines for these databases hosted? Where exactly are the hard disk racks housed? Who owns them? Africa needs her own data centers, located within the continent and managed by her own people. I am aware we own some, but at least make us aware that they are located in Africa, or aren’t you proud to be hosting in Africa? Will Africans shun your services just because you are ‘local’? Most will, and this must change. But it won’t change if you also offer shoddy services and then complain about inadequate resources.
Thirdly, the stored data must be analyzed, visualized, and interpreted. Tools like scikit-learn, pandas, and R exist for doing just that. Even though there is no point in reinventing the wheel by creating African tools, I look forward to seeing one made in Africa by Africa and not necessarily for Africa.
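The three steps above (mine, store, analyze) can be sketched end to end with nothing more than Python’s standard library. Everything here is an assumption made for illustration: the “mined” records are hard-coded and invented, the in-memory SQLite database stands in for a real storage engine, and a plain SQL average stands in for real analysis.

```python
import sqlite3

# Step 1 (mine): hypothetical records of (day, event, sales amount).
# In practice these would come from a scraper, a till, or a sensor.
mined = [
    ("Mon", "normal", 1200.0),
    ("Tue", "students' strike", 300.0),
    ("Wed", "normal", 1100.0),
    ("Thu", "students' strike", 250.0),
]


def store(records):
    """Step 2 (store): load the records into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day TEXT, event TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def average_sales_by_event(conn):
    """Step 3 (analyze): average sales for each kind of day."""
    rows = conn.execute(
        "SELECT event, AVG(amount) FROM sales GROUP BY event ORDER BY event"
    )
    return dict(rows.fetchall())


conn = store(mined)
averages = average_sales_by_event(conn)
for event, amount in averages.items():
    print(f"{event}: {amount:.2f}")
```

Even this toy version shows how a descriptive field (the event of the day) gives the numbers their meaning.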
Finally, we need problems to solve. Africa, unfortunately, houses a myriad of problems: food and health issues, major economic crises, problems with our own election and census data, massive untapped space in agriculture, industrial problems that require data insights, infrastructure problems, tourism, corruption, tribal problems that could be solved if looked into, and many more. Africa is blessed with environments in which any business can thrive, because we are just so bare. The only thing we lack is the incentive to look into the problems. The general populace needs money to survive. Big corporations, on the other hand, need insights from data. If big corporations cooperated with the public in an exchange of services, a lot of problems would be solved. I admire projects such as Zindi, which is a community of data scientists solving Africa's toughest challenges. It’s like a bug bounty platform for Africa’s data science problems. Such platforms are what we need.
To end with, Africa is nowhere near young; as far as I am concerned, we are the cradle of mankind. If an innovative country like India is branded as third world, then I can assure you that Africa belongs to the fourth world. Ignorance is killing us. Africa is chained to an outdated rhythm of life. The only problem is that we, like zombies, keep dancing to the tune. I thought Katy Perry’s “Chained to the Rhythm” was enough warning for you. The rest of the world isn’t advanced at all; they are exactly where humanity ought to be in this age of enlightenment. Africa is...