2025-09-23 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Data processing flow chart of data platform
1. Data preparation:
The data comes from several sources: FTP data sources, data pushed by partners, data obtained from Ctrip's open API, hotel management system (PMS) log data, and online travel agency website data.
2. Data access:
Because the data comes from multiple sources, a dedicated access method is developed for each scenario.
a. FTP data sources: handled with shell scripts that check whether the data is ready, download it, decrypt and unpack it, compress it with lzop, and upload the files to HDFS with put.
b. Data pushed by partners: a simple web service accepts the requests pushed by Ctrip. Nginx balances the request load, records the data carried in each request, and writes it to a file. The log collection system then picks the data up from that file. (In practice, the partner could also push the data directly to Kafka.)
c. Partner API data: a custom program implements the producer-consumer model. The producer writes tasks to a queue, and the consumer takes tasks from the queue and uses a thread pool to fetch data from the partner API concurrently.
d. PMS log data: collected mainly with the open source Flume component.
e. Website data: collected by crawling the websites with web crawlers.
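The producer-consumer model in item c can be sketched roughly as follows. This is a minimal illustration in Python, not the platform's actual program: `fetch_from_partner_api`, the hotel-ID task format, the queue size, and the worker count are all assumptions made for the example, and the fetch function is a stand-in for a real HTTP call.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def fetch_from_partner_api(task):
    """Hypothetical stand-in for a real HTTP request to the partner API."""
    return {"hotel_id": task, "status": "ok"}


def producer(task_queue, hotel_ids, sentinel):
    """Write one task per hotel ID to the queue, then signal completion."""
    for hotel_id in hotel_ids:
        task_queue.put(hotel_id)
    task_queue.put(sentinel)


def consumer(task_queue, sentinel, max_workers=4):
    """Take tasks from the queue and fetch them concurrently in a thread pool."""
    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            task = task_queue.get()
            if task is sentinel:
                break
            futures.append(pool.submit(fetch_from_partner_api, task))
    # Leaving the with-block waits for every submitted fetch to finish.
    return [future.result() for future in futures]


def run_pipeline(hotel_ids, max_workers=4):
    """Run one producer thread and one consumer, returning results in task order."""
    # Bounded queue (assumed size) so the producer cannot run far ahead.
    task_queue = queue.Queue(maxsize=100)
    sentinel = object()
    producer_thread = threading.Thread(
        target=producer, args=(task_queue, hotel_ids, sentinel)
    )
    producer_thread.start()
    results = consumer(task_queue, sentinel, max_workers)
    producer_thread.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([101, 102, 103]))
```

The bounded queue decouples task generation from fetching, and the thread pool caps how many requests are in flight against the partner API at once.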
3. Data storage:
Real-time data and offline data are stored separately: real-time data goes to Kafka, and offline data goes to HDFS.
4. Data processing:
Data processing tasks are developed mainly with MapReduce and Spark.
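To illustrate the programming model behind such tasks, here is a small in-memory sketch of the map, shuffle, and reduce phases in plain Python. It does not use the Hadoop or Spark APIs; the tab-separated record format (`source<TAB>payload`) and all function names are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and yield its (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the reducer to each key and its grouped values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def source_mapper(record):
    """Emit (source, 1) for a record in the assumed 'source<TAB>payload' format."""
    source, _payload = record.split("\t", 1)
    yield source, 1


def sum_reducer(_key, values):
    """Sum the counts emitted for one key."""
    return sum(values)


def count_by_source(records):
    """Count how many records arrived from each data source."""
    pairs = map_phase(records, source_mapper)
    return reduce_phase(shuffle_phase(pairs), sum_reducer)


if __name__ == "__main__":
    sample = ["ftp\tfile-a", "api\thotel-1", "ftp\tfile-b"]
    print(count_by_source(sample))
```

In a real job the framework distributes the map and reduce functions across the cluster and performs the shuffle itself; only the mapper and reducer logic would carry over from this sketch.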
5. Data query:
Hive serves as the query layer: users of the data platform query the data through Hive.