2025-09-07 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Before learning any Spark technology, make sure you understand Spark correctly. See: Correct Understanding of Spark
The following describes how to set up an environment for developing Spark with Python on macOS.
First, install Python
Spark 2.2.0 requires Python 2.7+ or Python 3.4+ (support for Python 2.6 was removed in Spark 2.2.0).
Please refer to:
http://jingyan.baidu.com/article/7908e85c78c743af491ad261.html
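Once Python is installed, the version requirement can be checked from Python itself. The following is a minimal sketch, assuming the minimums documented for Spark 2.2.0 (Python 2.7 and Python 3.4); the function name and the two constants are made up for this example, so adjust them if you target another Spark release.

```python
import sys

# Assumed minimum versions for Spark 2.2.0; change these for other releases.
MIN_PY2 = (2, 7)
MIN_PY3 = (3, 4)


def python_supported_by_spark(version_info):
    """Return True if the (major, minor) version meets the assumed minimums."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return (major, minor) >= MIN_PY2
    if major == 3:
        return (major, minor) >= MIN_PY3
    return False


if __name__ == "__main__":
    # Report the interpreter that is actually running this script.
    print(sys.version)
    print("supported: %s" % python_supported_by_spark(sys.version_info))
```

Run it with the same interpreter that PyCharm will use, since a Mac often has more than one Python installed.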
Second, download the Spark package and configure the environment variables
1. Download the spark-2.2.0-bin-hadoop2.6.tgz package from the official website: http://spark.apache.org/downloads.html
Save it to a local disk and extract it.
2. Set environment variables:
cd ~
vi .bash_profile
export SPARK_HOME=/Users/tangweiqun/Desktop/bigdata/spark/spark-2.2.0-bin-hadoop2.6
export PATH=$PATH:$SCALA_HOME/bin:$M2_HOME/bin:$JAVA_HOME/bin:$SPARK_HOME/bin
source .bash_profile
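To confirm that the profile changes took effect, the environment can be inspected from Python in a new terminal. This is only an illustrative sketch: the helper name `spark_bin_on_path` is invented here, and it takes the environment as a dictionary so it can be tried with sample values as well as with `os.environ`.

```python
import os


def spark_bin_on_path(environ):
    """Return True if $SPARK_HOME/bin is one of the PATH entries in environ."""
    spark_home = environ.get("SPARK_HOME")
    if not spark_home:
        return False
    expected = os.path.normpath(os.path.join(spark_home, "bin"))
    entries = environ.get("PATH", "").split(os.pathsep)
    # Normalize each entry so a trailing slash does not cause a mismatch.
    return any(os.path.normpath(e) == expected for e in entries if e)


if __name__ == "__main__":
    print("SPARK_HOME = %s" % os.environ.get("SPARK_HOME"))
    print("bin on PATH: %s" % spark_bin_on_path(os.environ))
```

If the second line reports False, the profile was probably not sourced in the current shell.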
3. Run chmod 744 ./* on the files in the bin directory under SPARK_HOME; otherwise an insufficient-permissions error will be reported.
Windows machines should not need this step.
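The same permission change can be made from Python if that is more convenient than the shell. The sketch below assumes a POSIX system; `make_executable` and `is_owner_executable` are hypothetical helper names, and unlike the shell glob the function only touches regular files directly inside the directory.

```python
import os
import stat


def make_executable(bin_dir, mode=0o744):
    """Apply mode to every regular file directly inside bin_dir.

    Returns the sorted list of file names that were changed.
    """
    changed = []
    for name in sorted(os.listdir(bin_dir)):
        path = os.path.join(bin_dir, name)
        if os.path.isfile(path):
            os.chmod(path, mode)
            changed.append(name)
    return changed


def is_owner_executable(path):
    """Return True if the owner execute bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


if __name__ == "__main__":
    spark_home = os.environ.get("SPARK_HOME")
    bin_dir = os.path.join(spark_home, "bin") if spark_home else None
    # Only act when SPARK_HOME is set and its bin directory exists.
    if bin_dir and os.path.isdir(bin_dir):
        print(make_executable(bin_dir))
```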
Third, install PyCharm
1. Download it from the official website: https://www.jetbrains.com/pycharm/download/ and install it with the default options.
Fourth, write wordcount.py and run it successfully
1. Create a project
File --> New Project
2. Configure PYTHONPATH for PyCharm
Run --> Edit Configurations, then configure as follows:
Click the "+" above, and then fill in:
PYTHONPATH=/Users/tangweiqun/Desktop/bigdata/spark/spark-2.2.0-bin-hadoop2.6/python/:/Users/tangweiqun/Desktop/bigdata/spark/spark-2.2.0-bin-hadoop2.6/python/lib/py4j-0.10.4-src.zip
This adds the Python-related dependencies shipped in the Spark installation package.
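The PYTHONPATH value above has two entries: the python directory of the Spark installation and the py4j source zip under python/lib. As a rough sketch of what that setting achieves, the same entries can be derived from SPARK_HOME and appended to `sys.path` at the top of a script. The function names are made up for this example, and the `py4j-*-src.zip` glob pattern is an assumption based on the file name shown above.

```python
import glob
import os
import sys


def pyspark_paths(spark_home):
    """Return the python directory and any py4j source zips under spark_home."""
    python_dir = os.path.join(spark_home, "python")
    # The py4j version differs between Spark releases, so match it with a glob.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def add_pyspark_to_sys_path(spark_home):
    """Append the PySpark entries to sys.path if they are not already there."""
    for entry in pyspark_paths(spark_home):
        if entry not in sys.path:
            sys.path.append(entry)


if __name__ == "__main__":
    home = os.environ.get("SPARK_HOME")
    if home:
        add_pyspark_to_sys_path(home)
        print(pyspark_paths(home))
```

Setting PYTHONPATH in the run configuration keeps the script itself free of machine-specific paths, which is why the article configures it in PyCharm instead.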
3. Add py4j-some-version.zip and pyspark.zip to the project
To be able to browse the source code, associate the project with the source as follows:
Click + Add Content Root and add the two zip packages under /Users/tangweiqun/Desktop/bigdata/spark/spark-2.2.0-bin-hadoop2.6/python/lib
4. Write the Spark word count and run it successfully
Create a Python file, wordcount.py, with the following contents:
from pyspark import SparkContext, SparkConf
import os
import shutil

if __name__ == "__main__":
    # Run Spark locally in a single JVM
    conf = SparkConf().setAppName("appName").setMaster("local")
    sc = SparkContext(conf=conf)

    # Read the input file as an RDD of lines
    sourceDataRDD = sc.textFile("file:///Users/tangweiqun/test.txt")
    # Split each line into words on whitespace
    wordsRDD = sourceDataRDD.flatMap(lambda line: line.split())
    # Pair each word with a count of 1
    keyValueWordsRDD = wordsRDD.map(lambda s: (s, 1))
    # Sum the counts for each word
    wordCountRDD = keyValueWordsRDD.reduceByKey(lambda a, b: a + b)

    # saveAsTextFile fails if the output directory exists, so remove it first
    outputPath = "/Users/tangweiqun/wordcount"
    if os.path.exists(outputPath):
        shutil.rmtree(outputPath)
    wordCountRDD.saveAsTextFile("file://" + outputPath)

    print(wordCountRDD.collect())
Right-click the file and choose Run; the job should complete successfully.
For a detailed and systematic understanding of the Spark core RDD API, see: Detailed Explanation of Spark Core RDD API Principles
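To see what the three RDD steps in wordcount.py compute, it can help to trace them on a small list without Spark. The following plain-Python sketch is not Spark code; it only mirrors the flatMap, map, and reduceByKey steps on an in-memory list, and the function name and sample lines are invented for illustration.

```python
def word_count(lines):
    """Mirror the flatMap / map / reduceByKey steps on a list of lines."""
    # flatMap: split every line on whitespace and flatten into one list of words
    words = [word for line in lines for word in line.split()]
    # map: pair each word with a count of 1
    pairs = [(word, 1) for word in words]
    # reduceByKey: add up the counts that share the same word
    counts = {}
    for word, n in pairs:
        counts[word] = counts.get(word, 0) + n
    return counts


if __name__ == "__main__":
    sample = ["hello spark", "hello python"]
    print(word_count(sample))
```

Spark performs the same logical steps, but distributes the data across partitions and merges the per-key sums between them.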