Apache Spark

Posts about Apache Spark

r/apachespark
11.0k members
Articles and discussion regarding anything to do with Apache Spark.
r/scala
48.9k members
Welcome to r/scala
r/dataengineering
92.1k members
News & discussion on Data Engineering topics, including but not limited to: data pipelines, databases, data formats, storage, data modeling, data governance, cleansing, NoSQL, distributed systems, streaming, batch, Big Data, and workflow engines.
r/bigdata
48.4k members
Welcome to r/bigdata
r/datascience
854k members
A place for data science practitioners and professionals to discuss and debate data science career questions.
r/programming
5.2m members
Computer Programming
r/MachineLearning
2.6m members
Welcome to r/MachineLearning
r/Python
1.1m members
News about the programming language Python. If you have something to teach others, post here. If you have questions or are a newbie, use r/learnpython.
r/aws
230k members
News, articles and tools covering Amazon Web Services (AWS), including S3, EC2, SQS, RDS, DynamoDB, IAM, CloudFormation, Route 53, CloudFront, Lambda, VPC, Cloudwatch, Glacier and more.
r/learnmachinelearning
286k members
A subreddit dedicated to learning machine learning
126
Posted 2 days ago

I was looking at the Stack Overflow Developer Survey 2022 (https://survey.stackoverflow.co/2022/#top-paying-technologies-other-frameworks-and-libraries) and saw that Apache Spark is the highest-paying technology in the "Other frameworks and libraries" category. I want to upskill and be more in demand in the job market. Is it worth learning Apache Spark (PySpark) in 2023?

67 comments
17
Posted 1 month ago
16 comments
84
Posted 3 months ago
55 comments
11
Posted 11 days ago

The awesome-spark repo has a list of Spark OSS libraries, but a lot of them are quite old.

I am thinking about curating another list that's more focused and up to date. For example, I'd like to know all the good data validation libraries for PySpark that are currently maintained. It's hard to figure out the best options.

Feel free to add the libraries you like in the comments and I'll try to collate a list.

1 comment
147
Posted 3 months ago
16 comments
12
Posted 3 months ago
13 comments