2025-09-19 Update — SLTechnology News&Howtos > Internet Technology
Shulou (Shulou.com), 06/03 report:
Recently, a SQL query had been running for more than two hours, so I set out to optimize it.
First, I checked the counters of the MapReduce job generated by the Hive SQL and found:
The total CPU time spent was as high as 100.4319973 hours.
Looking at the CPU time spent by each map task, the first one alone took 2.0540889 hours.
Based on this, the following optimizations are recommended:
1. Lower mapreduce.input.fileinputformat.split.maxsize (currently 256000000) to increase the number of map tasks. (This had an immediate effect: I set it to 32000000, which produced 500+ maps, and the job went from 2 hours down to 47 minutes.)
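As a sketch, the split-size change above can be applied per session before running the query (the values are the ones reported in this post):

```sql
-- Lower the maximum split size so the input is divided into more splits,
-- which yields more map tasks (here 256000000 -> 32000000).
set mapreduce.input.fileinputformat.split.maxsize=32000000;
```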
2. Optimize the UDFs getPageID, getSiteId, and getPageValue (these methods do heavy text matching with regular expressions).
2.1 For regular-expression optimization, see:
http://www.fasterj.com/articles/regex1.shtml
http://www.fasterj.com/articles/regex2.shtml
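To illustrate the kind of regex optimization those articles describe, here is a minimal, hypothetical sketch (the post does not show getPageID's real pattern, so the pattern and class name below are invented): compile the Pattern once as a class-level constant instead of calling String.matches or Pattern.compile on every row.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class PageIdExtractor {
    // Compiled once per JVM instead of once per row.
    // The pattern itself is a made-up stand-in for whatever getPageID matches.
    private static final Pattern PAGE_ID = Pattern.compile("pageid=(\\d+)");

    public static String extract(String url) {
        if (url == null) {
            return null;
        }
        Matcher m = PAGE_ID.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("http://example.com/a?pageid=42")); // prints 42
    }
}
```

Per-row Pattern.compile is one of the most common causes of slow regex-heavy UDFs, since compilation can cost far more than the match itself.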
2.2 For UDF optimization:
1. Use class-level private members to save on object instantiation and garbage collection.
2. You also benefit from matching the argument types to what normally arrives from upstream. Hive converts Text to String when needed, but if the data coming into the method is Text, accepting Text directly may be faster.

Example, before optimization:

```java
import org.apache.hadoop.hive.ql.exec.UDF;

import java.net.URLDecoder;

public final class urldecode extends UDF {

    public String evaluate(final String s) {
        if (s == null) {
            return null;
        }
        return getString(s);
    }

    public static String getString(String s) {
        String a;
        try {
            // Note: the one-argument decode() is deprecated; it uses the platform default charset.
            a = URLDecoder.decode(s);
        } catch (Exception e) {
            a = "";
        }
        return a;
    }

    public static void main(String[] args) {
        String t = "%E5%A4%AA%E5%8E%9F-%E4%B8%89%E4%BA%9A";
        System.out.println(getString(t));
    }
}
```
After optimization:
```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

import java.net.URLDecoder;

public final class urldecode extends UDF {

    // Class-level member: one Text instance is reused across calls
    // instead of allocating a new String per row.
    private final Text t = new Text();

    public Text evaluate(Text s) {
        if (s == null) {
            return null;
        }
        try {
            t.set(URLDecoder.decode(s.toString(), "UTF-8"));
            return t;
        } catch (Exception e) {
            return null;
        }
    }
}
```

3. Inherit from GenericUDF instead of UDF.
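For reference, a UDF like the one above would typically be registered and used in a Hive session roughly like this (the jar path, function name binding, and table name are placeholders, not from the post):

```sql
-- Register the compiled UDF (path and table are hypothetical)
ADD JAR /path/to/urldecode.jar;
CREATE TEMPORARY FUNCTION urldecode AS 'urldecode';

-- Decode a percent-encoded column
SELECT urldecode(url) FROM weblog LIMIT 10;
```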
3. On Hive 0.14+, you can enable hive.cache.expr.evaluation to turn on UDF result caching.
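As with the split size, this is a session-level setting; a sketch:

```sql
-- Hive 0.14+: cache the results of deterministic expressions/UDFs
-- so repeated evaluations on the same input value are not recomputed.
set hive.cache.expr.evaluation=true;
```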