Posts

Showing posts from October, 2017

4 - Moving Data Out of NiFi

Objectives: examine data in Queues from GetTwitter; create a flow from GetTwitter to a local folder; use Templates to create multiple flows.

Examine data in Queues from GetTwitter

Last time we successfully pulled data from a Twitter account into our local instance of NiFi. We are going to pick up right where we left off, so the NiFi Flow should look something like this (I have removed the GenerateFlowFile to Log Processors).

If you run the GetTwitter Processor without turning on the LogAttribute Processor, you will see the Queue in the link between the processors fill up. You'll notice in the above picture that I have 976 files queued. These Queues are a good way to look at the data before it reaches a processor, which is useful when debugging a problem with files moving through multiple processors in a larger flow.

To see a file in the Queue, right-click on the Queue and click on the "List queue" option. You will...

3 - Getting Twitter to Talk to NiFi Using GetTwitter

Objectives: create a GetTwitter Processor in NiFi; set up Twitter to talk to NiFi; create a flow using GetTwitter.

Create a GetTwitter Processor in NiFi

Welcome back! Today we are going to use NiFi's GetTwitter processor to create a flow of data into NiFi. To start, we need to start our local NiFi instance and add a GetTwitter processor.

Looking at your flow, you'll notice two things. First, our GenerateFlowFile --> LogAttribute Flow is still there from last time, which is good to see. Second, GetTwitter isn't linked to anything and won't be able to do anything until it's configured. Let's open the configuration and go to the Properties tab to see what we need.

The first entry determines what type of Endpoint we are creating. This will essentially determine whether we want to see all Tweets in real time or whether we want to pick which Twitter feeds to ...

2 - NiFi: UI, Processors, and Flow

Objectives: NiFi Web UI overview; create a processor and flow; start a flow.

NiFi Web UI Overview

First things first: fire up your local NiFi instance by running the run-nifi.bat file in the bin folder. If you are new here, starting up the instance is detailed in the previous post. Open a web browser and access the NiFi UI on your local machine by going to this URL: http://localhost:8080/nifi

You should see your empty NiFi flow, which is in the root Process Group. A NiFi Flow is the whitish grid area. This first flow is in the root Process Group, meaning everything is contained within this initial root flow. Process Groups can be nested within each other, so we can make Process Groups within the root Process Group. Your screen should show a blank NiFi UI.

On the top dark blue bar, you have several buttons that can be used to drag various items into your NiFi Flow. Those items include: ...
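If the browser shows nothing at that address, it can help to check from a script whether the UI is answering yet, since NiFi can take a little while to start. The following is only a minimal sketch using Python's standard library. The helper names `nifi_ui_url` and `is_reachable` are made up for this illustration and are not part of NiFi, and the host and port are simply the defaults mentioned in the post.

```python
import http.client
import urllib.error
import urllib.request


def nifi_ui_url(host: str = "localhost", port: int = 8080) -> str:
    """Build the NiFi UI address from the post (hypothetical helper)."""
    return f"http://{host}:{port}/nifi"


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # urlopen follows redirects and raises on 4xx/5xx responses,
            # so reaching this line normally means the page was served.
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, malformed URL, and similar failures.
        return False


if __name__ == "__main__":
    url = nifi_ui_url()
    state = "reachable" if is_reachable(url) else "not reachable (is NiFi still starting?)"
    print(f"{url} is {state}")
```

Running the script right after starting run-nifi.bat and again a minute later is a quick way to tell "still starting" apart from "not running at all".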

1 - Getting Started with Apache NiFi

Objectives: learn about the history of NiFi; install NiFi; start up NiFi for the first time on a local machine.

NiFi Overview

The project was created by the United States' infamous National Security Agency (NSA) and had the codename "NiagaraFiles". In 2014 the NSA released it as open-source software. The primary developer of the project so far has been Hortonworks, who are also well known for Hadoop and their suite of data science software.

At its core, Apache NiFi is used to manage data flow. This means that its job is to take in data, process it, then send it along to another destination. Think of it as a mail service for data. One of its advantages is that it uses a web-based interface that is simple and allows users to manage the data flow in real time.

Installing NiFi

I am installing NiFi on my Windows 7 PC; installing it on a Mac or Linux OS will follow a different procedure. Also of note: We w...

0 - Hello and Welcome

Welcome!

Objectives of the blog as a whole: learn more about the open source tool Apache NiFi; use NiFi to query a Twitter feed for data; use NiFi to push that data to an endpoint for public display; gain knowledge of data science and data engineering.

Style of the blog

If you're following along with this blog, I hope that you will try to do the things you see in the posts. With that said, I will do my best to explain how everything is done and make it as easy as possible for you, the reader, to recreate. If you ever have any questions or comments along the way, don't be afraid to leave a comment.

One thing I have found when looking at forums, tutorials, and blogs is that authors sometimes include bits that don't actually work, or intentionally add a mistake. I will not do this. I will also do my best to compartmentalize and label the parts of a given solution. Hopefully this blog will serve as a useful resource to you.

The tool we will be using is ...