Hive provides a SQL-like query language on HDFS (Hadoop Distributed File System)
Hive defines a simple SQL-like query language, called QL, that enables users familiar with SQL to query the data. At the same time, this language allows programmers who are familiar with the MapReduce framework to plug in their custom mappers and reducers to perform more sophisticated analysis that may not be supported by the built-in capabilities of the language. QL can also be extended with custom scalar functions (UDFs), aggregations (UDAFs), and table functions (UDTFs).
Hive Query Language provides the following features:
Basic SQL
- From clause subquery
- ANSI JOIN (equi-join only)
- Multi-table Insert
- Multi group-by
- Sampling
- Objects traversal
Extensibility
- Pluggable MapReduce scripts in the language of your choice using TRANSFORM (Syntax changing soon!!)
- Pluggable User Defined Functions
- Pluggable User Defined Types
- Pluggable SerDes to read different kinds of data formats
Below is an example of the Hive query language. Notably, it looks very much like standard SQL.
SELECT pageid, COUNT(DISTINCT userid)
FROM page_view GROUP BY pageid
It is almost the same as SQL in a conventional relational database (RDB). This is a great feature of Hive, because programmers with RDB experience can write queries easily.
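To make the semantics of the query above concrete, here is a minimal Python sketch of what `COUNT(DISTINCT userid) ... GROUP BY pageid` computes. The sample rows and the function name `distinct_users_per_page` are made up for illustration; this is not how Hive executes the query internally, only an in-memory equivalent of its result.

```python
from collections import defaultdict

# Hypothetical sample of the page_view table: one dict per row.
page_view = [
    {"pageid": "home", "userid": "u1"},
    {"pageid": "home", "userid": "u2"},
    {"pageid": "home", "userid": "u1"},   # repeat visit, counted once
    {"pageid": "about", "userid": "u2"},
]


def distinct_users_per_page(rows):
    """Group rows by pageid and count the distinct userids in each group."""
    users_by_page = defaultdict(set)
    for row in rows:
        users_by_page[row["pageid"]].add(row["userid"])
    return {pageid: len(users) for pageid, users in users_by_page.items()}


result = distinct_users_per_page(page_view)
print(result)
```

On a real cluster Hive would compile the query into jobs over data in HDFS; the sketch only shows the grouping and de-duplication logic.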
Hive does not mandate read or written data be in the "Hive format"—there is no such thing. Hive works equally well on Thrift, control delimited, or your specialized data formats. Please see File Format and SerDe in the Developer Guide for details.
Hive is not designed for OLTP workloads and does not offer real-time queries or row-level updates. It is best used for batch jobs over large sets of append-only data (like web logs). What Hive values most are scalability (scale out with more machines added dynamically to the Hadoop cluster), extensibility (with MapReduce framework and UDF/UDAF/UDTF), fault-tolerance, and loose-coupling with its input formats.
The following is the data model for Hive.
References
https://cwiki.apache.org/confluence/display/Hive/Home
Hive ApacheCon 2008, New Orleans, LA (Ashish Thusoo, Facebook)
Apache HBase is a storage system, with roots in Hadoop, and uses HDFS for underlying storage.
Apache HBase is a storage system, with roots in Hadoop, from which it gets its "H". Though HBase uses HDFS for underlying storage, HBase is designed much more for fast and frequent access to blobs of binary data.
It is an example of what most would call a NoSQL column-oriented store; it holds semi-structured values for keys.
Below is the reference architecture based on HDFS, MapReduce, and HBase.
MapReduce can be used to process the stored data in parallel. I plan to study HBase in more detail in the future to understand it better.
Reference
http://www.acunu.com/blogs/sean-owen/hadoop-universe/
http://hortonworks.com/technology/hortonworksdataplatform/
HDFS (Hadoop Distributed File System) is designed to run on commodity (low-cost) hardware
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant.
HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware.
HDFS provides high throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data.
HDFS was originally built as infrastructure for the Apache Nutch web search engine project. HDFS is now an Apache Hadoop subproject. The project URL is http://hadoop.apache.org/hdfs/.
The goals and assumptions of HDFS
- Hardware failure is the norm rather than the exception.
- Streaming Data Access
- Large Data Sets
- Simple Coherency Model
- Moving Computation is Cheaper than Moving Data
- Portability Across Heterogeneous Hardware and Software Platforms
Data Replication
HDFS is designed to reliably store very large files across machines in a large cluster.
MapReduce Software Framework
Offers a clean abstraction between data analysis tasks and the underlying systems challenges involved in ensuring reliable large-scale computation.
- Processes large jobs in parallel across many nodes and combines results.
- Eliminates the bottlenecks imposed by monolithic storage systems.
- Results are collated and digested into a single output after each piece has been analyzed.
References
http://hadoop.apache.org/common/docs/current/hdfs_design.html
http://www.cloudera.com/what-is-hadoop/hadoop-overview/
http://www.infoq.com/articles/data-mine-cloud-hadoop
Hadoop MapReduce is a software framework for processing vast amounts of data in-parallel on large clusters
Hadoop MapReduce is a programming model and software framework for writing applications that rapidly process vast amounts of data in parallel on large clusters of compute nodes.
In other words, Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.

A MapReduce job usually splits the input data-set into independent chunks which are processed by the map tasks in a completely parallel manner. The framework sorts the outputs of the maps, which are then input to the reduce tasks. Typically both the input and the output of the job are stored in a file-system. The framework takes care of scheduling tasks, monitoring them, and re-executing the failed tasks.
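The split, map, sort, and reduce flow described above can be sketched in plain Python. This is a single-process illustration with made-up function names (`map_phase`, `reduce_phase`, `run_job`), not the Hadoop API; a real job runs the map and reduce tasks on many nodes.

```python
from itertools import groupby
from operator import itemgetter


def map_phase(chunk):
    """Map task: emit a (word, 1) pair for every word in one input chunk."""
    for word in chunk.split():
        yield (word, 1)


def reduce_phase(key, values):
    """Reduce task: combine all values that share one key."""
    return key, sum(values)


def run_job(chunks):
    """Run the map tasks, sort their output by key, then run the reduce tasks."""
    # Each chunk is independent, so on a cluster these calls could run in parallel.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    # The framework sorts map output so that equal keys become adjacent.
    mapped.sort(key=itemgetter(0))
    results = {}
    for key, group in groupby(mapped, key=itemgetter(0)):
        word, total = reduce_phase(key, (value for _, value in group))
        results[word] = total
    return results


print(run_job(["a b a", "b c"]))
```

Word counting is used only because it is the smallest job that needs both phases.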

Typically the compute nodes and the storage nodes are the same, that is, the MapReduce framework and the Hadoop Distributed File System (see the HDFS Architecture diagram below) are running on the same set of nodes. This configuration allows the framework to effectively schedule tasks on the nodes where data is already present, resulting in very high aggregate bandwidth across the cluster.

The MapReduce framework consists of a single master JobTracker and one slave TaskTracker per cluster-node. The master is responsible for scheduling the jobs’ component tasks on the slaves, monitoring them and re-executing the failed tasks. The slaves execute the tasks as directed by the master.
Minimally, applications specify the input/output locations and supply map and reduce functions via implementations of appropriate interfaces and/or abstract-classes. These, and other job parameters, comprise the job configuration. The Hadoop job client then submits the job (jar/executable etc.) and configuration to the JobTracker which then assumes the responsibility of distributing the software/configuration to the slaves, scheduling tasks and monitoring them, providing status and diagnostic information to the job-client.
Although the Hadoop framework is implemented in Java, MapReduce applications need not be written in Java.
Hadoop Streaming is a utility which allows users to create and run jobs with any executables (e.g. shell utilities) as the mapper and/or the reducer.
Hadoop Pipes is a SWIG-compatible C++ API to implement MapReduce applications (non-JNI based).
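As a rough illustration of the Hadoop Streaming idea, the mapper and reducer below work purely on lines of text, using the tab-separated key/value convention. The function names and the `simulate` helper are hypothetical; in an actual streaming job each function would live in its own script, read `sys.stdin`, and print to stdout, and the framework would do the sorting between the two.

```python
def mapper(lines):
    """Turn each input line into 'word<TAB>1' records, one per word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum the counts of consecutive records that share a key.

    The input must already be sorted by key, as it is when the
    framework hands map output to a reducer.
    """
    current_key, total = None, 0
    for line in sorted_lines:
        key, value = line.rstrip("\n").split("\t", 1)
        if key != current_key:
            if current_key is not None:
                yield f"{current_key}\t{total}"
            current_key, total = key, 0
        total += int(value)
    if current_key is not None:
        yield f"{current_key}\t{total}"


def simulate(lines):
    """Local stand-in for the pipeline: mapper, then sort, then reducer."""
    return list(reducer(sorted(mapper(lines))))


print(simulate(["b a", "a"]))
```

Because both sides only exchange text lines, the same mapper and reducer could just as well be shell utilities or programs in any other language.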
References
- http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#Inputs+and+Outputs
Apache Hadoop is designed to scale up from single servers to thousands of machines
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model.

It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.

The above yellow elephant is the mascot for Hadoop.
Table of international country codes, time zones, and dialing prefixes
Here are the international dial codes and time zones (GMT offsets) for each country.
| Country | International dial code | Start GMT | End GMT |
| --- | --- | --- | --- |
| Albania | 355 | GMT+01:00 | |
| Algeria | 213 | GMT | |
| Andorra | 376 | GMT+01:00 | |
| Angola | 244 | GMT+01:00 | |
| Anguilla | 264 | GMT-04:00 | |
| Antigua and Barbuda | 268 | GMT-04:00 | |
| Argentina | 54 | GMT-03:00 | |
| Armenia | 374 | GMT+04:00 | |
| Aruba | 297 | GMT-04:00 | |
| Ascension Island | 247 | GMT | |
| Australia | 61 | GMT+10:00 | GMT+07:00 |
| Austria | 43 | GMT+01:00 | |
| Azerbaijan | 994 | GMT+04:00 | |
| Bahamas | 242 | GMT-05:00 | |
| Bahrain | 973 | GMT+03:00 | |
| Bangladesh | 880 | GMT+06:00 | |
| Barbados | 246 | GMT-04:00 | |
| Belarus | 375 | GMT+03:00 | |
| Belgium | 32 | GMT+01:00 | |
| Belize | 501 | GMT-06:00 | |
| Benin | 229 | GMT+01:00 | |
| Bermuda | 441 | GMT-04:00 | |
| Bhutan | 975 | GMT+05:30 | |
| Bolivia | 591 | GMT-04:00 | |
| Bosnia | 387 | GMT+01:00 | |
| Botswana | 267 | GMT+02:00 | |
| Brazil | 55 | GMT-03:00 | GMT-05:00 |
| Brunei | 673 | GMT+08:00 | |
| Bulgaria | 359 | GMT+02:00 | |
| Burkina Faso | 226 | GMT | |
| Burundi | 257 | GMT+02:00 | |
| Cambodia | 855 | GMT+07:00 | |
| Cameroon | 237 | GMT+01:00 | |
| Canada | 1 | GMT-04:00 | GMT-08:00 |
| Cape Verde Islands | 238 | GMT-01:00 | |
| Cayman Islands | 345 | GMT-05:00 | |
| Central African Republic | 236 | GMT+01:00 | |
| Chad | 235 | GMT+01:00 | |
| Chile | 56 | GMT-04:00 | |
| China | 86 | GMT+08:00 | |
| Colombia | 57 | GMT-05:00 | |
| Comoros Island | 269 | GMT+03:00 | |
| Congo | 242 | GMT+01:00 | |
| Cook Islands | 682 | GMT-10:00 | |
| Costa Rica | 506 | GMT-06:00 | |
| Croatia | 385 | GMT+01:00 | |
| Cuba | 53 | GMT-03:00 | |
| Cyprus | 357 | GMT+02:00 | |
| Czech Republic | 420 | GMT+01:00 | |
| Democratic Republic of Congo (Zaire) | 243 | GMT+02:00 | GMT+01:00 |
| Denmark | 45 | GMT+01:00 | |
| Diego Garcia | 246 | GMT+05:00 | |
| Djibouti | 253 | GMT+03:00 | |
| Dominica Islands | 767 | GMT-04:00 | |
| Dominican Republic | 809 | GMT-04:00 | |
| Ecuador | 593 | GMT-05:00 | |
| Egypt | 20 | GMT+02:00 | |
| El Salvador | 503 | GMT-06:00 | |
| Equatorial Guinea | 240 | GMT+01:00 | |
| Eritrea | 291 | GMT+03:00 | |
| Estonia | 372 | GMT+03:00 | |
| Ethiopia | 251 | GMT+03:00 | |
| Faeroe Islands | 298 | GMT | |
| Falkland Islands | 500 | GMT-04:00 | |
| Fiji Islands | 679 | GMT+12:00 | |
| Finland | 358 | GMT+02:00 | |
| France | 33 | GMT+01:00 | |
| French Guiana | 594 | GMT-04:00 | |
| French Polynesia | 689 | GMT-10:00 | |
| Gabon | 241 | GMT+01:00 | |
| Georgia | 995 | GMT+04:00 | |
| Germany | 49 | GMT+01:00 | |
| Ghana | 233 | GMT | |
| Gibraltar | 350 | GMT+01:00 | |
| Greece | 30 | GMT+02:00 | |
| Greenland | 299 | GMT-03:00 | |
| Grenada | 473 | GMT-04:00 | |
| Guadeloupe | 590 | GMT-04:00 | |
| Guam | 671 | GMT+10:00 | |
| Guatemala | 502 | GMT-06:00 | |
| Guinea Bissau | 245 | GMT-01:00 | |
| Guinea Republic | 224 | GMT | |
| Guyana | 592 | GMT-03:00 | |
| Haiti | 509 | GMT-05:00 | |
| Honduras | 503 | GMT-06:00 | |
| Hong Kong | 852 | GMT+08:00 | |
| Hungary | 36 | GMT+01:00 | |
| Iceland | 354 | GMT | |
| India | 91 | GMT+05:30 | |
| Indonesia | 62 | GMT+09:00 | GMT+07:00 |
| Iran | 98 | GMT+03:30 | |
| Iraq | 964 | GMT+03:00 | |
| Ireland | 353 | GMT | |
| Israel | 972 | GMT+02:00 | |
| Italy | 39 | GMT+01:00 | |
| Ivory Coast | 225 | GMT | |
| Jamaica | 876 | GMT-05:00 | |
| Japan | 81 | GMT+09:00 | |
| Jordan | 962 | GMT+02:00 | |
| Kazakhstan | 7 | GMT+06:00 | |
| Kenya | 254 | GMT+03:00 | |
| Kiribati | 686 | GMT+12:00 | |
| Korea, North | 850 | GMT+09:00 | |
| Korea, South | 82 | GMT+09:00 | |
| Kuwait | 965 | GMT+03:00 | |
| Kyrgyzstan | 996 | GMT+06:00 | |
| Laos | 856 | GMT+07:00 | |
| Latvia | 371 | GMT+03:00 | |
| Lebanon | 961 | GMT+02:00 | |
| Lesotho | 266 | GMT+02:00 | |
| Liberia | 231 | GMT | |
| Libya | 218 | GMT+02:00 | |
| Liechtenstein | 423 | GMT+01:00 | |
| Lithuania | 370 | GMT+02:00 | |
| Luxembourg | 352 | GMT+01:00 | |
| Macau | 853 | GMT+08:00 | |
| Macedonia (FYROM) | 389 | GMT+01:00 | |
| Madagascar | 261 | GMT+03:00 | |
| Malawi | 265 | GMT+02:00 | |
| Malaysia | 60 | GMT+08:00 | |
| Maldives Republic | 960 | GMT+05:00 | |
| Mali | 223 | GMT | |
| Malta | 356 | GMT+01:00 | |
| Mariana Islands | 670 | GMT+10:00 | |
| Marshall Islands | 692 | GMT+10:00 | |
| Martinique | 596 | GMT-04:00 | |
| Mauritius | 230 | GMT+04:00 | |
| Mayotte Islands | 269 | GMT+03:00 | |
| Mexico | 52 | GMT-06:00 | GMT-08:00 |
| Micronesia | 691 | GMT+10:00 | |
| Moldova | 373 | GMT+03:00 | |
| Monaco | 377 | GMT+01:00 | |
| Mongolia | 976 | GMT+08:00 | |
| Montserrat | 664 | GMT-04:00 | |
| Morocco | 212 | GMT | |
| Mozambique | 258 | GMT+02:00 | |
| Myanmar (Burma) | 95 | GMT+06:30 | |
| Namibia | 264 | GMT+02:00 | |
| Nauru | 674 | GMT+12:00 | |
| Nepal | 977 | GMT+05:30 | |
| Netherlands | 31 | GMT+01:00 | |
| Netherlands Antilles | 599 | GMT-04:00 | |
| New Caledonia | 687 | GMT+11:00 | |
| New Zealand | 64 | GMT+12:00 | |
| Nicaragua | 505 | GMT-06:00 | |
| Niger | 227 | GMT+01:00 | |
| Nigeria | 234 | GMT+01:00 | |
| Niue Island | 683 | GMT-11:00 | |
| Norfolk Island | 672 | GMT+11:30 | |
| Norway | 47 | GMT+01:00 | |
| Oman | 968 | GMT+04:00 | |
| Pakistan | 92 | GMT+05:00 | |
| Palau | 680 | GMT+09:00 | |
| Palestine | 970 | GMT+02:00 | |
| Panama | 507 | GMT-05:00 | |
| Papua New Guinea | 675 | GMT+10:00 | |
| Paraguay | 595 | GMT-04:00 | |
| Peru | 51 | GMT-05:00 | |
| Philippines | 63 | GMT+08:00 | |
| Poland | 48 | GMT+01:00 | |
| Portugal | 351 | GMT+01:00 | |
| Puerto Rico | 787 | GMT-04:00 | |
| Qatar | 974 | GMT+03:00 | |
| Reunion Island | 262 | GMT+04:00 | |
| Romania | 40 | GMT+02:00 | |
| Russia | 7 | GMT+03:00 | |
| Rwanda | 250 | GMT+02:00 | |
| Samoa (American) | 684 | GMT-11:00 | |
| Samoa (Western) | 685 | GMT-11:00 | |
| San Marino | 378 | GMT+01:00 | |
| Sao Tome & Principe | 239 | GMT | |
| Saudi Arabia | 966 | GMT+03:00 | |
| Senegal | 221 | GMT | |
| Serbia | 381 | GMT+01:00 | |
| Seychelles | 248 | GMT+04:00 | |
| Sierra Leone | 232 | GMT | |
| Singapore | 65 | GMT+08:00 | |
| Slovak Republic | 421 | GMT+01:00 | |
| Slovenia | 386 | GMT+01:00 | |
| Solomon Islands | 677 | GMT+11:00 | |
| Somalia | 252 | GMT+03:00 | |
| South Africa | 27 | GMT+02:00 | |
| Spain | 34 | GMT+01:00 | |
| Sri Lanka | 94 | GMT+05:30 | |
| St Helena | 290 | GMT | |
| St Kitts & Nevis | 869 | GMT-04:00 | |
| St Lucia | 758 | GMT-04:00 | |
| Sudan | 249 | GMT+02:00 | |
| Surinam | 597 | GMT-03:30 | |
| Swaziland | 268 | GMT+02:00 | |
| Sweden | 46 | GMT+01:00 | |
| Switzerland | 41 | GMT+01:00 | |
| Syria | 963 | GMT+02:00 | |
| Taiwan | 886 | GMT+08:00 | |
| Tajikistan | 992 | GMT+06:00 | |
| Tanzania | 255 | GMT+03:00 | |
| Thailand | 66 | GMT+07:00 | |
| The Gambia | 220 | GMT | |
| Togo | 228 | GMT | |
| Tonga | 676 | GMT+13:00 | |
| Trinidad & Tobago | 868 | GMT-04:00 | |
| Tunisia | 216 | GMT+01:00 | |
| Turkey | 90 | GMT+02:00 | |
| Turkmenistan | 993 | GMT+05:00 | |
| Turks & Caicos Islands | 649 | GMT-05:00 | |
| Tuvalu | 688 | GMT+12:00 | |
| Uganda | 256 | GMT+03:00 | |
| Ukraine | 380 | GMT+03:00 | |
| United Arab Emirates | 971 | GMT+04:00 | |
| United Kingdom | 44 | GMT | |
| Uruguay | 598 | GMT-03:00 | |
| USA | 1 | GMT-05:00 | GMT-11:00 |
| Uzbekistan | 998 | GMT+06:00 | |
| Vanuatu | 678 | GMT+11:00 | |
| Venezuela | 58 | GMT-04:00 | |
| Vietnam | 84 | GMT+07:00 | |
| Wallis & Futuna Islands | 681 | GMT+12:00 | |
| Yemen Arab Republic | 967 | GMT+03:00 | |
| Zambia | 260 | GMT+02:00 | |
| Zimbabwe | 263 | GMT+02:00 | |
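If you want to use the table above in a program, a lookup could be sketched like this. Only three rows are copied in as sample data; the names `ROWS`, `parse_gmt_offset`, and `lookup` are made up for this example, and a missing End GMT is assumed to mean the country has a single offset.

```python
import re
from datetime import timedelta

# A few rows copied from the table: country -> (dial code, start GMT, end GMT).
ROWS = {
    "India": ("91", "GMT+05:30", None),
    "United Kingdom": ("44", "GMT", None),
    "Brazil": ("55", "GMT-03:00", "GMT-05:00"),
}

# Matches "GMT" alone or "GMT" followed by a signed HH:MM offset.
_OFFSET = re.compile(r"^GMT(?:([+-])(\d{2}):(\d{2}))?$")


def parse_gmt_offset(label):
    """Convert a label such as 'GMT+05:30' into a timedelta."""
    match = _OFFSET.match(label)
    if match is None:
        raise ValueError(f"unrecognised GMT label: {label!r}")
    sign, hours, minutes = match.groups()
    if sign is None:
        return timedelta(0)
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    return delta if sign == "+" else -delta


def lookup(country):
    """Return (dialing prefix, start offset, end offset) for a country."""
    code, start, end = ROWS[country]
    start_offset = parse_gmt_offset(start)
    # Assumption: an empty End GMT column means a single time zone.
    end_offset = parse_gmt_offset(end) if end else start_offset
    return "+" + code, start_offset, end_offset


print(lookup("India"))
```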
Web Cache function in Network Gateway could cause internet service trouble
Have you ever had trouble with an internet-connected device or internet-based software? Check whether your case matches the following:
1) The device works well at other people’s homes or with another ISP, but it does not work on your home network.
2) The public IP address shown by your router and the one reported by http://ip.kurapa.com are not the same.
If your case matches both of the above, you need to contact your network administrator.
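The second check can be written down as a tiny comparison. This sketch makes no network calls: you paste in the two addresses yourself, and the function name `addresses_match` and the sample addresses (taken from the documentation-only ranges) are assumptions for illustration.

```python
import ipaddress


def addresses_match(router_wan_ip, externally_seen_ip):
    """Return True when both strings denote the same IP address.

    router_wan_ip: the public address shown on the router's status page.
    externally_seen_ip: the address reported by an external "what is my IP" page.
    """
    router = ipaddress.ip_address(router_wan_ip.strip())
    external = ipaddress.ip_address(externally_seen_ip.strip())
    return router == external


# Sample values only; replace them with the addresses you actually see.
if addresses_match("203.0.113.10", "198.51.100.7"):
    print("Addresses match: traffic appears to leave directly from your router.")
else:
    print("Addresses differ: something between your router and the web page is rewriting traffic.")
```

A mismatch on its own does not prove that a web cache is the cause, so treat it as a reason to ask the ISP rather than as a diagnosis.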

<Picture: Network Gateway Specifications>
A Web Cache feature has been added to Network Gateways since 2009 to improve network performance. It is a good feature in terms of QoS (Quality of Service), so some ISPs are adopting Web Cache enabled Network Gateways. However, some web cache implementations have a bug that breaks OpenAPI-based internet applications such as Google Maps, Twitter, and Facebook.
The simplest way to resolve the problems above is to turn off the Web Cache option. If your setup matches the description above, please contact your ISP’s network administrator.
Drinking diet shakes during pregnancy
Diet shakes are intended to replace all or some portion of meals, with the goal of reducing calories. In and of themselves, they may make for a good “snack” during pregnancy, but they should not replace a well-balanced diet.
In general, dieting for weight loss is discouraged during pregnancy. The fetus needs a full supply of calories and nutrients for normal development. A balanced diet that allows for a total weight gain of about 30 to 35 pounds is usually sufficient for this.
The supplements added to many diet shakes present another safety issue. For example, the additional vitamin A in some shakes – on top of the amount in prenatal vitamins – may exceed the daily amount considered safe in pregnancy.
Using my microwave oven during pregnancy
The dangers of microwave radiation, much like the dangers of cell phone and other non-ionizing forms of radiation, have absolutely no basis in scientific fact. Microwave radiation can’t change the molecular structure of anything, because it simply doesn’t have enough energy to break apart chemical bonds.
To put it in perspective, plain old visible blue light has many, many times the energy of a microwave, and can break apart weak chemical bonds (this is what causes photochemical smog).
Microwaves actually have less energy than the infrared radiation (i.e. heat) that is given off by our bodies and the earth.
If you want to worry about radiation, worry about the small amount of UVB light that manages to reach the earth.
That radiation has enough energy to break apart the chemical bonds that make up DNA, causing cancer.
Microwaves, at 1/100,000 of the energy necessary to break apart chemical bonds, are closer to radio waves. All they do is heat up your food.
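The energy comparison above can be checked with the photon energy formula E = h·f (or E = h·c/λ). The sketch below assumes an oven frequency of 2.45 GHz and takes 450 nm as a representative blue wavelength; both figures are assumptions chosen for illustration, not values from the text.

```python
PLANCK_EV_S = 4.135667696e-15   # Planck constant in eV*s
LIGHT_SPEED_M_S = 299_792_458.0  # speed of light in m/s


def photon_energy_from_frequency(hertz):
    """Energy of one photon in electronvolts, from its frequency."""
    return PLANCK_EV_S * hertz


def photon_energy_from_wavelength(metres):
    """Energy of one photon in electronvolts, from its wavelength."""
    return PLANCK_EV_S * LIGHT_SPEED_M_S / metres


microwave_ev = photon_energy_from_frequency(2.45e9)   # assumed oven frequency
blue_ev = photon_energy_from_wavelength(450e-9)        # assumed blue wavelength
ratio = blue_ev / microwave_ev

print(f"microwave photon: {microwave_ev:.2e} eV")
print(f"blue photon:      {blue_ev:.2f} eV")
print(f"blue / microwave: {ratio:,.0f}")
```

Under these assumptions a blue photon carries a few hundred thousand times the energy of a microwave photon, which is the same order-of-magnitude gap the text describes.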
Taking vitamin C during pregnancy
Too much Vitamin C can cause cell damage in the fetus.
You should consume a normal amount of vitamin C when you’re pregnant. The recommended daily amount is 85 mg for pregnant women age 19 and older. The maximum is 2,000 mg per day.
If you’re taking prenatal vitamins, you’ll be getting vitamin C in that supplement. You’ll also get some from the food you eat. If you decide to take more, remember to keep the total under 2,000 mg.