DATE 01/01/2018

Introduction

Extracting useful information from a large pool of data is termed data mining.
The information industry holds a huge amount of data that is useless unless it is converted into beneficial information and analyzed, for example to detect fraud, learn buyers' choices, control the manufacturing of products and understand the market better.

Data mining helps entrepreneurs know their customers better: their choices, the deals they consider worth their money, their income and the criteria by which they like to spend. It also indicates how often a customer likes to spend and makes it possible to relate different people with similar choices. Apart from these, it also assists the corporate sector.

Data mining functions are categorized as "Descriptive" and "Classification and prediction" on the basis of the type of data.
1. Descriptive function

It describes the basic features of the information in a database, such as:
- Class/concept description
- Mining of frequent patterns
- Mining of associations
- Mining of correlations
- Mining of clusters

CLASS/CONCEPT DESCRIPTION
Class: The products to be sold by the company, for example, clothes.
Concept: The spending behaviour of customers, for example, shoppers who spend freely or those who buy on a budget.
These descriptions can be derived in two ways:
- Data Characterization: Summarizing the data of the class under study, namely the 'Target class'.
- Data Discrimination: Comparing the target class with a designated contrasting class.
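As a rough illustration of characterization and discrimination, the sketch below summarizes a target class and then compares it with a contrasting class. The records, the class names ("budget", "big_spender") and the helper functions are hypothetical and chosen only for this example; mean spend is just one possible summary.

```python
from statistics import mean

# Hypothetical customer records: (customer concept, amount spent per visit).
purchases = [
    ("budget", 20.0),
    ("budget", 30.0),
    ("big_spender", 200.0),
    ("big_spender", 300.0),
]

def characterize(records, target):
    """Data characterization: summarize the records of the target class."""
    amounts = [amount for label, amount in records if label == target]
    return {"count": len(amounts), "mean_spend": mean(amounts)}

def discriminate(records, target, contrast):
    """Data discrimination: compare the target class with a contrasting class."""
    target_summary = characterize(records, target)
    contrast_summary = characterize(records, contrast)
    difference = target_summary["mean_spend"] - contrast_summary["mean_spend"]
    return {"mean_spend_difference": difference}

budget_summary = characterize(purchases, "budget")
comparison = discriminate(purchases, "budget", "big_spender")
print(budget_summary)
print(comparison)
```

A negative difference here simply means the target class spends less per visit, on average, than the contrasting class.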
MINING OF FREQUENT PATTERNS
The products (patterns) that occur often in transactional data are termed frequent patterns.
- Frequent item set: A set of products that frequently appear together, such as top and bottom wear in the clothing section.
- Frequent sub sequence: A sequence of purchases that occurs frequently, such as buying pet food followed by pet treats.
- Frequent sub structure: Graphs, trees or other structural forms that may be attached to item sets or sub sequences.

MINING OF ASSOCIATION
The items that are generally bought together are included in this category. With its help a businessman discovers the percentage of association between products bought together, for example that 60 percent of the time a mobile phone is bought with a mobile cover and 40 percent of the time with a screen guard.

MINING OF CORRELATION
It reveals the effect of the purchase of one product on another: whether the effect is negative, positive or absent.

MINING OF CLUSTERS
It groups similar products together. The products within a cluster resemble one another, while each cluster differs from the others.
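The association and correlation ideas above can be made concrete with a small sketch. It assumes a hypothetical list of transactions, constructed so that the phone figures match the 60 and 40 percent example in the text. "Confidence" is used here as the percentage of association, and "lift" is one common correlation measure chosen for illustration (above 1 suggests a positive effect, below 1 a negative effect, and 1 no effect); the text itself does not name a specific measure.

```python
# Hypothetical transactions; each set holds the items bought together.
transactions = [
    {"phone", "cover"},
    {"phone", "cover", "screen guard"},
    {"phone", "cover"},
    {"phone", "screen guard"},
    {"phone"},
    {"cover"},
    {"charger"},
    {"charger", "cover"},
]

def support(itemset):
    """Fraction of all transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """Fraction of transactions with the antecedent that also hold the consequent."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Confidence relative to how common the consequent is overall."""
    return confidence(antecedent, consequent) / support(consequent)

cover_conf = confidence({"phone"}, {"cover"})
guard_conf = confidence({"phone"}, {"screen guard"})
cover_lift = lift({"phone"}, {"cover"})
guard_lift = lift({"phone"}, {"screen guard"})

print("phone -> cover:", cover_conf, cover_lift)
print("phone -> screen guard:", guard_conf, guard_lift)
```

In this made-up data the screen guard has a lift above 1 with the phone, while the cover has a lift slightly below 1 because covers are also bought often without a phone. This shows that a high percentage of association does not by itself imply a positive correlation.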
2. Classification and prediction

The class label of some items may be unknown. Classification and prediction is a procedure that can be used to uncover the class or concept of such data. The derived model can be presented as:
(a) Classification (If-Then) rules
(b) Decision trees
(c) Mathematical formulae
(d) Neural networks

FUNCTIONS:
- Classification: Deriving a model that distinguishes the classes or concepts of the data. The model is derived from objects whose class labels are known.
- Prediction: Regression analysis is used to predict unknown numerical values rather than class labels. It is also used to identify sales trends on the basis of the available data.
- Outlier analysis: Data that does not conform to the general model of the available data is an outlier.
- Evolution analysis: It describes objects whose behaviour changes over time.

HOW DOES THE CLASSIFICATION WORK?
It incorporates two stages:
- Building the classifier or model
- Using the classifier for classification

BUILDING THE CLASSIFIER
- It is the learning step.
- Classification algorithms build the classifier.
- The classifier is built from a training set made up of database tuples and their associated class labels.
- Each tuple in the training set belongs to a category or class; these tuples are also known as samples, objects or data points.

USING THE CLASSIFIER
The classifier is used for classification. This involves estimating the accuracy of the classification rules and, if the accuracy is considered adequate, applying the rules to new data tuples.

DATA MINING TASK PRIMITIVES