Apache Hive Essentials.pdf
Apache Hive Essentials
www.it-ebooks.info

Table of Contents

Apache Hive Essentials
Credits
About the Author
About the Reviewers
www.PacktPub.com
  Support files, eBooks, discount offers, and more
  Why subscribe?
  Free access for Packt account holders
Preface
  What this book covers
  What you need for this book
  Who this book is for
  Conventions
  Reader feedback
  Customer support
  Downloading the example code
  Errata
  Piracy
  Questions

1. Overview of Big Data and Hive
  A short history
  Introducing big data
  Relational and NoSQL database versus Hadoop
  Batch, real-time, and stream processing
  Overview of the Hadoop ecosystem
  Hive overview
  Summary

2. Setting Up the Hive Environment
  Installing Hive from Apache
  Installing Hive from vendor packages
  Starting Hive in the cloud
  Using the Hive command line and Beeline
  The Hive-integrated development environment
  Summary

3. Data Definition and Description
  Understanding Hive data types
  Data type conversions
  Hive Data Definition Language
  Hive database
  Hive internal and external tables
  Hive partitions
  Hive buckets
  Hive views
  Summary

4. Data Selection and Scope
  The SELECT statement
  The INNER JOIN statement
  The OUTER JOIN and CROSS JOIN statements
  Special JOIN - MAPJOIN
  Set operation - UNION ALL
  Summary

5. Data Manipulation
  Data exchange - LOAD
  Data exchange - INSERT
  Data exchange - EXPORT and IMPORT
  ORDER and SORT
  Operators and functions
  Transactions
  Summary

6. Data Aggregation and Sampling
  Basic aggregation - GROUP BY
  Advanced aggregation - GROUPING SETS
  Advanced aggregation - ROLLUP and CUBE
  Aggregation condition - HAVING
  Analytic functions
  Sampling
  Summary

7. Performance Considerations
  Performance utilities
  The EXPLAIN statement
  The ANALYZE statement
  Design optimization
  Partition tables
  Bucket tables
  Index
  Data file optimization
  File format
  Compression
  Storage optimization
  Job and query optimization
  Local mode
  JVM reuse
  Parallel execution
  Join optimization
  Common join
  Map join
  Bucket map join
  Sort merge bucket (SMB) join
  Sort merge bucket map (SMBM) join
  Skew join
  Summary

8. Extensibility Considerations
  User-defined functions
  The UDF code template
  The UDAF code template
  The UDTF code template
  Development and deployment
  Streaming
  SerDe
  Summary

9. Security Considerations
  Authentication
  Metastore server authentication
  HiveServer2 authentication
  Authorization
  Legacy mode
  Storage-based mode
  SQL standard-based mode
  Encryption
  Summary

10. Working with Other Tools
  JDBC/ODBC connector
  HBase
  Hue
  HCatalog
  ZooKeeper
  Oozie
  Hive roadmap
  Summary

Index