Posts

Showing posts from September, 2013

Analysis for Nano Science/ Nano Technologies

Benzene (C6H6) is the last fundamental chemical compound to have had its atomic structure uncovered. This discovery paved the way for two things: creating large numbers of complex synthetic molecules using computer simulations and deep mathematics (which in turn led to the explosion of synthetic drugs in big pharma), and creating very useful but weird carbon molecules using computer simulations and deep mathematics (which in turn led to the explosion of nanotechnologies). Big innovation in combinatorial chemistry today is rooted in the discovery and understanding of complex, unusual, bizarre atomic structures such as benzene, and in the application of advanced analytic principles. Let's start with a crash course in chemistry. Then I'll explain how analytics helps create these incredibly powerful and useful new technologies.

1. Basic Chemistry Tutorial

Molecules are made of atoms. Atoms are the "prime number" entities that generate all the molecules. There are about 100 types

Fast Clustering Algorithm for big data "Mapreduce" "Can be Neuro-Science too as Big Data"

We can perform clustering extremely fast on big data sets, and also produce a graphical representation of such complex clustering structures. By extremely fast, we mean a computational complexity of order O(n), or even faster, such as O(n/log n). This is much faster than good hierarchical agglomerative clustering algorithms, which are typically O(n^2 log n). By big data, we mean several million, possibly a billion, observations. Potential applications: (1) Creating a keyword taxonomy to categorize the entire universe of cleaned (standardized), valuable English keywords. We are talking about roughly 10 million keywords made up of one, two or three tokens, that is, about 300 times the number of keywords found in a good English dictionary. The purpose might be to categorize all bid keywords that could be purchased by eBay and Amazon on Google (for pay-per-click ad campaigns), in order to price them better. This is the application discussed in this article. (2) Clustering millions of documents (e.g. books on A
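To get a feel for the complexity orders quoted in this excerpt, here is a minimal Python sketch that compares rough operation counts for the data sizes mentioned (10 million keywords, up to a billion observations). It is only an illustration: constant factors are ignored, the natural logarithm is an assumption (the excerpt does not state a base), and `operation_counts` is a hypothetical helper name, not part of the article's algorithm.

```python
import math


def operation_counts(n):
    """Rough operation counts for n observations, ignoring constant factors.

    Assumption: natural logarithm; the excerpt does not specify the base.
    """
    log_n = math.log(n)
    return {
        "O(n/log n)": n / log_n,
        "O(n)": float(n),
        "O(n^2 log n)": n * n * log_n,
    }


if __name__ == "__main__":
    # 10 million keywords and 1 billion observations, as in the excerpt.
    for n in (10_000_000, 1_000_000_000):
        print(f"n = {n:,}")
        for label, count in operation_counts(n).items():
            print(f"  {label:<14} ~ {count:.2e} operations")
```

Even at 10 million observations, the gap between O(n) and O(n^2 log n) is a factor of more than 100 million, which is why a hierarchical approach becomes impractical at this scale.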

Real Big Data From Human Genetics & Neuro-Science

The genome is the code of life. It works much like computer code, in which you specify the functions and rules of the final software, such as Excel or Word. The genome is made up of a chemical called DNA (deoxyribonucleic acid). The fundamental unit of computer code is binary, i.e. it can take two values, 0 or 1. Similarly, DNA's fundamental unit takes four values, A, T, G and C, recorded in terms of base pairs. The human genome has approximately 3 billion base pairs. So when we say that there is around a 1% difference between the genomes of two different humans, it means a difference of 30 million base pairs. Since each of these 30 million base pairs can take four values (A, T, G and C), the differing positions could be filled in 4^(30 million) different ways. This number is vastly larger than the number of stars in the universe. To put this in perspective, Windows-7 occupies around 10 Gigabytes of space on the computer. 10 GB is about 8.6 billion
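The arithmetic in this excerpt can be checked with a short Python sketch. It assumes the figures quoted in the text (about 3 billion base pairs, around 1% difference, four possible bases) and counts the possible fillings of the differing positions as 4 raised to the number of positions. The function names are illustrative, not from the original post. Because 4^(30 million) is far too large to print, the sketch reports how many decimal digits it has instead.

```python
import math

GENOME_BASE_PAIRS = 3_000_000_000  # approximate figure quoted in the text
DIFFERENCE_PERCENT = 1             # the "around 1%" figure quoted in the text
BASES = 4                          # A, T, G and C


def differing_base_pairs(genome=GENOME_BASE_PAIRS, percent=DIFFERENCE_PERCENT):
    """Number of base pairs that differ between two genomes (integer math)."""
    return genome * percent // 100


def decimal_digits_of_power(base, exponent):
    """Number of decimal digits of base**exponent, without computing the power."""
    return math.floor(exponent * math.log10(base)) + 1


def bits_needed(positions, alphabet_size=BASES):
    """Bits needed to store one symbol per position (2 bits for 4 bases)."""
    return positions * math.ceil(math.log2(alphabet_size))


if __name__ == "__main__":
    positions = differing_base_pairs()
    print(f"Differing base pairs: {positions:,}")
    print(f"Digits in 4^{positions:,}: {decimal_digits_of_power(BASES, positions):,}")
    print(f"Bits to encode those positions: {bits_needed(positions):,}")
```

The number of combinations has roughly 18 million decimal digits, whereas estimates of the number of stars in the universe have only a few dozen.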