
Apache Spark and Python

  • Writer: GRoot
  • Dec 31, 2019
  • 1 min read

There is a package for Python that will give you access to Apache Spark.

First, you need to install Spark itself. (That is a separate process altogether.)

Then install Python, either directly or through Anaconda/Miniconda.

(I prefer the direct and Miniconda methods because Anaconda is so large, and it sometimes falls behind in updating available packages.)

Once this is done, you can install the Python package called PySpark.

From the conda prompt, run "conda install pyspark".

This will install the main package plus a Java support package, Py4J.
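The conda route described above is essentially a one-liner; a sketch, assuming you want a dedicated environment (the environment name "spark-env" here is just an example):

```shell
# Optional: create and activate a fresh environment first
# ("spark-env" is a hypothetical name; pick your own)
conda create -n spark-env python
conda activate spark-env

# Install PySpark; conda pulls in the Py4J dependency automatically
conda install pyspark
```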

For a regular Python install, you can use pip from the command line and try "pip install pyspark".

This process may fail; if it does, install pypandoc first with "pip install pypandoc".

After that you should be able to run the pip command to install PySpark.
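The pip route, including the pypandoc fallback and a quick check that the install worked, can be sketched as:

```shell
# Try the straightforward install first
pip install pyspark

# If that fails, install pypandoc, then retry PySpark
pip install pypandoc
pip install pyspark

# Verify the install by printing the PySpark version
python -c "import pyspark; print(pyspark.__version__)"
```

If the version prints without an import error, the package and its Py4J dependency are in place.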




Root Systems Development