Code I/O

A topnotch WordPress.com site



5 Minutes on Java : Validating XML data against a Schema …

Taking the idea of XML Schema generation further (see my previous article), an XML document can be validated against its schema before processing it.  This is critical: without validation, a non-conforming document would be processed blindly, with unpredictable results.

To illustrate XML validation, I’m using Apache XMLBeans as suggested in my previous blog entry.

Here is the XMLValidator code that validates an XML document against a schema.

import java.io.File;
import java.io.IOException;

import org.apache.xmlbeans.SchemaTypeLoader;
import org.apache.xmlbeans.XmlBeans;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;
import org.apache.xmlbeans.XmlOptions;

public class XMLValidator {
	public boolean validate(File dataFile, File schemaFile) {
		boolean status = false;

		try {
			// Only one schema to validate it against
			XmlObject[] schemas = { XmlObject.Factory.parse(schemaFile,
					new XmlOptions().setLoadLineNumbers()
							.setLoadMessageDigest()) };

			SchemaTypeLoader loader = XmlBeans.compileXsd(schemas, null,
					new XmlOptions().setErrorListener(null)
							.setCompileDownloadUrls().setCompileNoPvrRule());

			XmlObject object = loader
					.parse(dataFile,
							null,
							new XmlOptions()
									.setLoadLineNumbers(XmlOptions.LOAD_LINE_NUMBERS_END_ELEMENT));

			status = object.validate();

			System.out.println("Validation Status: " + status);
		} catch (XmlException e1) {
			e1.printStackTrace();
		} catch (IOException e1) {
			e1.printStackTrace();
		}

		return status;
	}
}
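XMLBeans isn’t the only option here: the JDK itself ships a validation API in javax.xml.validation that does the same job without any third-party jars. A minimal sketch (the class name JdkXmlValidator is my own, not part of the code above):

```java
import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class JdkXmlValidator {
	// Returns true when the XML document conforms to the given W3C schema.
	public static boolean validate(File dataFile, File schemaFile) {
		try {
			SchemaFactory factory =
					SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
			Schema schema = factory.newSchema(schemaFile);
			Validator validator = schema.newValidator();
			// validate() throws SAXException on the first violation
			validator.validate(new StreamSource(dataFile));
			return true;
		} catch (SAXException | java.io.IOException e) {
			System.out.println("Validation failed: " + e.getMessage());
			return false;
		}
	}
}
```

Unlike the XMLBeans version, this one reports the first violation via the exception message rather than a boolean status from object.validate().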



5 Minutes on Java : A Java API for obtaining OAuth tokens with an embedded callback handler

If your Java application requires an OAuth callback, implementing it normally requires a Web/App container to host the callback endpoint.  For a desktop application, however, it is ideal to invoke an API that implicitly launches an embedded HTTP server, processes the tokens, and returns them to the caller.  Users of the API can then consume the tokens without worrying about the callback handler.

This TokenGenerator depends on Java-Scribe (the Scribe OAuth library).  Here is a brief usage example:

public static void main(String[] args){
		final String APP_NAME = "tg";

		// Obtain AccessKey
		if(args.length >= 3){
			String providerName = args[0].trim();
			String key = args[1].trim();
			String secret = args[2].trim();

			String scope = null;

			if(args.length == 4) {
				scope = args[3].trim();
			}
			int timeout = 1000 * 12;

			if(args.length == 5){
				timeout = Integer.parseInt(args[4].trim());
			}

			System.out.println("app = " + providerName + "\n" +
					"key = " + key + "\n" +
					"secret = " + secret + "\n" +
					"timeout = " + timeout);

			OAuthConnection connection = new OAuthConnection();
			connection.setApiKey(key);
			connection.setApiSecret(secret);
			connection.setApiProvider(providerName);
			connection.scopeCSV(scope);

			Provider provider = ProviderFactory.getEntryByProviderName(providerName.toLowerCase());

			if(provider == null) return;

			TokenGenerator tokenGenerator = new TokenGenerator(provider, connection, timeout);
			tokenGenerator.generateToken();

			Token accessToken = tokenGenerator.getAccessToken();
			if(accessToken != null){
				System.out.println("Access Token: " + accessToken.getToken());
				System.out.println("Access Secret: " + accessToken.getSecret());
			}
		}
		// Show Usage
		else {
			StringBuffer buffer = new StringBuffer();
			buffer.append("Incorrect usage of this application!!!\n");
			buffer.append("USAGE: " + APP_NAME + " <provider> <key> <secret> [scope] [timeout]\n");
			buffer.append("EXAMPLE: " + APP_NAME + " twitter YOUR_TWITTER_API_KEY YOUR_TWITTER_API_SECRET\n");
			buffer.append("\t " + APP_NAME + " facebook YOUR_FACEBOOK_API_KEY YOUR_FACEBOOK_API_SECRET read_stream,user_status,friends_status,offline_access\n");

			buffer.append("\n");

			buffer.append("Supported applications\n");
			for(String key : ProviderFactory.getProviderNames()){
				buffer.append("* " + key + "\n");
			}

			System.out.println(buffer.toString());
			System.exit(1);
		}
	}

OAuthConnection is a POJO holding the fundamental attributes required for authorization.

import java.util.ArrayList;
import java.util.List;

public class OAuthConnection extends Connection {
	private String apiProvider = null;
	private String apiKey = null;
	private String apiSecret = null;
	private List<String> apiScopeList = null;
	private String accessToken = null;
	private String accessSecret = null;

	public OAuthConnection(){
		this.authentication = Authentication.OAUTH;
	}

	public String scopeCSV() {
		if(apiScopeList != null && apiScopeList.size() > 0){
			String scope = apiScopeList.toString();

			scope = scope.replace("[", "");
			scope = scope.replace("]", "");
			scope = scope.replaceAll(" ", "");

			return scope;
		}

		return null;
	}

	public String toString(){
		String scope = scopeCSV();

		return connectionName + "\t" + apiProvider + "\t" + apiKey + "\t" + apiSecret + "\t" + ((scope != null) ? scope : "null")  + "\t" + accessToken + "\t" + accessSecret;
	}

	public void scopeCSV(String csvText) {
		if(csvText == null || csvText.length() == 0){
			apiScopeList = null;
		}
		else {

			String[] values = csvText.split(",");
			apiScopeList = new ArrayList<String>();

			for(String value : values){
				apiScopeList.add(value);
			}

			System.out.println(connectionName + " SCOPE: " + apiScopeList);
		}
	}

	public String getConnectionName() {
		return connectionName;
	}

	public void setConnectionName(String connectionName) {
		this.connectionName = connectionName;
	}

	public String getApiProvider() {
		return apiProvider;
	}

	public void setApiProvider(String apiProvider) {
		this.apiProvider = apiProvider;
	}

	public String getApiKey() {
		return apiKey;
	}

	public void setApiKey(String apiKey) {
		this.apiKey = apiKey;
	}

	public String getApiSecret() {
		return apiSecret;
	}

	public void setApiSecret(String apiSecret) {
		this.apiSecret = apiSecret;
	}

	public List<String> getApiScope() {
		return apiScopeList;
	}

	public void setApiScope(List<String> apiScope) {
		this.apiScopeList = apiScope;
	}

	public String getAccessToken() {
		return accessToken;
	}

	public void setAccessToken(String accessToken) {
		this.accessToken = accessToken;
	}

	public String getAccessSecret() {
		return accessSecret;
	}

	public void setAccessSecret(String accessSecret) {
		this.accessSecret = accessSecret;
	}
}
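As an aside, the bracket-stripping in scopeCSV() can be expressed more directly with String.join (Java 8+). A small sketch with hypothetical helper names, doing the same round-trip:

```java
import java.util.Arrays;
import java.util.List;

public class ScopeCsv {
	// Join a scope list into the comma-separated form OAuth providers expect.
	public static String toCsv(List<String> scopes) {
		if (scopes == null || scopes.isEmpty()) {
			return null;
		}
		return String.join(",", scopes);
	}

	// Split CSV text back into a scope list, tolerating stray whitespace.
	public static List<String> fromCsv(String csvText) {
		if (csvText == null || csvText.trim().isEmpty()) {
			return null;
		}
		return Arrays.asList(csvText.trim().split("\\s*,\\s*"));
	}
}
```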

ProviderFactory is a registry for Providers.  It pre-packages third-party provider information against which the user input can be validated.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
public class ProviderFactory {
	private static Map<String, Provider> entries = new TreeMap<String, Provider>();

	static {
		Provider provider = null;

		provider = new FacebookProvider();
		entries.put(provider.getApiProvider(), provider);

		provider = new GoogleProvider();
		entries.put(provider.getApiProvider(), provider);

		provider = new LinkedInProvider();
		entries.put(provider.getApiProvider(), provider);

		provider = new TwitterProvider();
		entries.put(provider.getApiProvider(), provider);

		provider = new FoursquareProvider();
		entries.put(provider.getApiProvider(), provider);

		System.out.println(entries);
	}

	public static Provider getEntryByProviderName(String key){
		return entries.get(key);
	}

	public static List<String> getProviderNames() {
		List<String> providers = new ArrayList<String>();

		for(String key : entries.keySet()){
			// if(key != null && key.trim().length() > 0){
				providers.add(key);
			// }
		}

		return providers;
	}
}
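The same registry idea can be made reusable and case-insensitive, which removes the need for the providerName.toLowerCase() step seen in main(). A generic sketch (the class name Registry is mine, not part of the code above):

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// A minimal registry sketch: lookups ignore case, so callers don't
// have to normalize provider names themselves.
public class Registry<T> {
	private final Map<String, T> entries =
			new TreeMap<String, T>(String.CASE_INSENSITIVE_ORDER);

	public void register(String name, T entry) {
		entries.put(name, entry);
	}

	public T lookup(String name) {
		return entries.get(name);
	}

	// Names come back sorted, which is handy for the usage listing.
	public Set<String> names() {
		return entries.keySet();
	}
}
```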

Here is a typical Provider interface, which each provider must implement.

import java.util.List;

import org.scribe.builder.api.Api;

// TODO: Refer to https://apigee.com/console/ for exploring APIs
public interface Provider {
	String getApiProvider();
	String getApiReference();
	ApiVersion getApiVersion();
	Class<? extends Api> getApiClass();
	List<String> getApiScopeList();
	List<RESTRequest> getApiRequestList();
}

Here is an AbstractProvider that can be extended.

import java.util.List;
import org.scribe.builder.api.Api;

public abstract class AbstractProvider implements Provider {
	protected ApiVersion apiVersion = ApiVersion.V1;

	protected String apiProvider = null;
	protected String apiReference = null;
	protected Class<? extends Api> apiClass = null;
	protected List<String> apiScopeList = null;
	protected List<RESTRequest> apiRequestList = null;

	public ApiVersion getApiVersion(){
		return apiVersion;
	}

	public String getApiProvider() {
		return apiProvider;
	}
	public void setApiProvider(String apiProvider) {
		this.apiProvider = apiProvider;
	}
	public String getApiReference() {
		return apiReference;
	}
	public void setApiReference(String apiReference) {
		this.apiReference = apiReference;
	}
	public Class<? extends Api> getApiClass() {
		return apiClass;
	}
	public void setApiClass(Class<? extends Api> apiClass) {
		this.apiClass = apiClass;
	}
	public List<String> getApiScopeList() {
		return apiScopeList;
	}
	public void setApiScopeList(List<String> apiScopeList) {
		this.apiScopeList = apiScopeList;
	}
	public List<RESTRequest> getApiRequestList() {
		return apiRequestList;
	}
	public void setApiRequestList(List<RESTRequest> apiRequestList) {
		this.apiRequestList = apiRequestList;
	}

	public AbstractProvider(){
		super();

		init();
	}

	protected abstract void init();
}

Here is an example implementation: GoogleProvider.

import java.util.ArrayList;
import java.util.List;

public class GoogleProvider extends AbstractProvider {
	final static String ME = "me";
	final static String ORGANIZATION = "Informatica";
	final static String PUBLIC = "public";

	@Override
	protected void init() {
		// REFERENCE: http://code.google.com/apis/gdata/faq.html#AuthScopes
		String[] scopeArray = {
			"https://www.googleapis.com/auth/plus.me"
		};

		List<String> scopeList = new ArrayList<String>();
		for(String scope : scopeArray){
			scopeList.add(scope);
		}

		// REFERENCE: depends on what scope you need
		final String ACTIVITY_ID = "<activityId>";
		final String COMMENT_ID = "<commentId>";
		RESTRequest[] resourceArray = {
			// REFERENCE : Google+ : https://developers.google.com/+/
			// People
			new RESTRequest("https://www.googleapis.com/plus/v1/people/%s", new String[] {ME}),
			new RESTRequest("https://www.googleapis.com/plus/v1/people?maxResults=20&query=%s", new String[] {ORGANIZATION}),
			new RESTRequest("https://www.googleapis.com/plus/v1/people/%s/activities/%s/?maxResults=100", new String[] {ME, PUBLIC}),

			// Activities
			new RESTRequest("https://www.googleapis.com/plus/v1/people/%s/activities/%s/?maxResults=100", new String[] {ME, PUBLIC}),
			new RESTRequest("https://www.googleapis.com/plus/v1/activities/%s", new String[] {ACTIVITY_ID}),
			new RESTRequest("https://www.googleapis.com/plus/v1/activities/?maxResults=20&query=%s", new String[] {ORGANIZATION}),

			// Comments
			new RESTRequest("https://www.googleapis.com/plus/v1/activities/%s/comments", new String[] {ACTIVITY_ID}),
			new RESTRequest("https://www.googleapis.com/plus/v1/comments/%s", new String[] {COMMENT_ID}),

		};

		// http://code.google.com/apis/gdata/faq.html#AuthScopes
		List<RESTRequest> requestList = new ArrayList<RESTRequest>();
		for(RESTRequest resource : resourceArray){
			requestList.add(resource);
		}

		// REFERENCE: https://code.google.com/apis/console/
		this.apiProvider = "google";
		this.apiReference = "http://code.google.com/more/";
		this.apiClass = org.scribe.builder.api.GoogleApi.class;
		this.apiScopeList = scopeList;
		this.apiRequestList = requestList;
	}
}

Here is the listing for the RESTRequest POJO, which is used extensively.

import java.util.Map;

import org.scribe.model.Verb;

public class RESTRequest {
	private Verb verb = null;
	private String url = null;

	private String parameters = null;
	private String payload = null;
	private Map<String, String> headers = null;

	private String[] variableValues = null;
	private boolean accessTokenRequired = true;

	public RESTRequest(String url){
		this(Verb.GET, url);
	}

	public RESTRequest(Verb verb, String url){
		this(verb, url, null);
	}

	public RESTRequest(String url, String[] variableValues){
		this(Verb.GET, url, variableValues);
	}

	public RESTRequest(Verb verb, String url, String[] variableValues){
		this(verb, url, true, variableValues);
	}

	public RESTRequest(String url, boolean accessTokenRequired, String[] variableValues) {
		this(Verb.GET, url, accessTokenRequired, variableValues);
	}

	public RESTRequest(Verb verb, String url, boolean accessTokenRequired, String[] variableValues){
		this.verb = verb;
		this.url = url;
		this.variableValues = variableValues;
		this.accessTokenRequired = accessTokenRequired;
	}

	public String toString(){
		return url;
	}

	public Verb getVerb() {
		return verb;
	}

	public void setVerb(Verb verb) {
		this.verb = verb;
	}

	public String getUrl() {
		return url;
	}

	public void setUrl(String url) {
		this.url = url;
	}

	public String[] getVariableValues() {
		return variableValues;
	}

	public void setVariableValues(String[] variableValues) {
		this.variableValues = variableValues;
	}

	public boolean isAccessTokenRequired() {
		return accessTokenRequired;
	}

	public void setAccessTokenRequired(boolean accessTokenRequired) {
		this.accessTokenRequired = accessTokenRequired;
	}

	public void setParameters(String parameters) {
		this.parameters = parameters;
	}

	public String getParameters() {
		return parameters;
	}

	public String getPayload() {
		return payload;
	}

	public void setPayload(String payload) {
		this.payload = payload;
	}

	public Map<String, String> getHeaders() {
		return headers;
	}

	public void setHeaders(Map<String, String> headers) {
		this.headers = headers;
	}
}
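The %s placeholders in the request URLs are presumably filled, in order, from variableValues with String.format; a sketch of that expansion step (UrlTemplate is a hypothetical helper, not shown in the original):

```java
// Expand a RESTRequest-style URL template: each %s in the url is
// replaced, in order, by the corresponding variable value supplied
// at construction time.
public class UrlTemplate {
	public static String expand(String url, String[] variableValues) {
		if (variableValues == null || variableValues.length == 0) {
			return url;
		}
		// The String[] must be widened to Object[] so format() sees
		// individual arguments rather than one array argument.
		return String.format(url, (Object[]) variableValues);
	}
}
```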



5 Minutes on Java : Metro Web Services – solution for wsimport failure

There is a bug in wsimport (the command-line wrapper): it throws the exception “Failed to load Main-Class manifest attribute …” because the wsimport script is faulty.

 

It is acknowledged as a bug; however, I didn’t see a fix in the latest build I downloaded.  Here are the modifications to the wrapper script (.bat file) that temporarily resolve the issue.

 

The key changes are the JDK path in line 1 and the METRO_LIB (classpath) in line 28.

 

set JAVA_HOME=<PATH_TO_JDK1.7>

rem
rem Infer METRO_HOME if not set
rem
if not "%METRO_HOME%" == "" goto CHECKJAVAHOME

rem Try to locate METRO_HOME
set METRO_HOME=%~dp0
set METRO_HOME=%METRO_HOME%..
if exist %METRO_HOME%\lib\webservices-tools.jar goto CHECKJAVAHOME

rem Unable to find it
echo METRO_HOME must be set before running this script
goto END

:CHECKJAVAHOME
if not "%JAVA_HOME%" == "" goto USE_JAVA_HOME

set JAVA=java
goto LAUNCH

:USE_JAVA_HOME
set JAVA="%JAVA_HOME%\bin\java"
goto LAUNCH

:LAUNCH
set METRO_LIB=%METRO_HOME%\lib\stax-api.jar;%METRO_HOME%\lib\webservices-api.jar;%METRO_HOME%\lib\webservices-extra.jar;%METRO_HOME%\lib\webservices-extra-api.jar;%METRO_HOME%\lib\webservices-rt.jar;%METRO_HOME%\lib\webservices-tools.jar

REM %JAVA% -classpath %METRO_LIB% %WSIMPORT_OPTS% -jar "%METRO_HOME%\lib\webservices-tools.jar" %*
%JAVA% -classpath %METRO_LIB% %WSIMPORT_OPTS% com.sun.tools.ws.WsImport %*

:END
%COMSPEC% /C exit %ERRORLEVEL%



5 Minutes on Java : XML Schema generation

I was constantly using an online XSD generator … however, I recently found that the site had been hacked 😦 which forced me to home-brew a similar solution.

One of Apache’s projects, XMLBeans, does exactly what I needed: http://xmlbeans.apache.org/

You can install the binaries and use the xmlbean.jar to invoke the conversion API.  Here is an XMLBeans wrapper that can be reused.

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringWriter;

import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;
import org.apache.xmlbeans.XmlOptions;
import org.apache.xmlbeans.impl.inst2xsd.Inst2Xsd;
import org.apache.xmlbeans.impl.inst2xsd.Inst2XsdOptions;
import org.apache.xmlbeans.impl.xb.xsdschema.SchemaDocument;

public class XMLBeans {
	public static void main(String[] args) {
		try {
			XMLBeans xmlBeans = new XMLBeans();
			SchemaDocument schemaDocument = xmlBeans.generateSchema(new File(
					"/Shared/test-data/test-data.xml"));

			StringWriter writer = new StringWriter();
			schemaDocument.save(writer, new XmlOptions().setSavePrettyPrint());
			writer.close();

			String xmlText = writer.toString();
			System.out.println("Schema\n\n" + xmlText);

		} catch (IOException e) {
			e.printStackTrace();
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

	public SchemaDocument generateSchema(File inputFile) throws XmlException,
	IOException {
		return generateSchema(inputFile, XMLSchemaDesign.VENETIAN_BLIND);
	}

	public SchemaDocument generateSchema(File inputFile, XMLSchemaDesign design)
			throws XmlException, IOException {
		// Only 1 instance is required for now
		XmlObject[] xmlInstances = new XmlObject[1];
		xmlInstances[0] = XmlObject.Factory.parse(inputFile);

		return inst2xsd(xmlInstances, design);
	}

	public SchemaDocument generateSchema(InputStream is, XMLSchemaDesign design)
			throws XmlException, IOException {
		// Only 1 instance is required for now
		XmlObject[] xmlInstances = new XmlObject[1];
		xmlInstances[0] = XmlObject.Factory.parse(is);

		return inst2xsd(xmlInstances, design);
	}

	public SchemaDocument generateSchema(String input) throws XmlException,
	IOException {
		return generateSchema(input, XMLSchemaDesign.VENETIAN_BLIND);
	}

	public SchemaDocument generateSchema(String input, XMLSchemaDesign design)
			throws XmlException, IOException {
		// Only 1 instance is required for now
		XmlObject[] xmlInstances = new XmlObject[1];
		xmlInstances[0] = XmlObject.Factory.parse(input);

		return inst2xsd(xmlInstances, design);
	}

	private SchemaDocument inst2xsd(XmlObject[] xmlInstances,
			XMLSchemaDesign design) throws IOException {
		Inst2XsdOptions inst2XsdOptions = new Inst2XsdOptions();
		if (design == null || design == XMLSchemaDesign.VENETIAN_BLIND) {
			inst2XsdOptions.setDesign(Inst2XsdOptions.DESIGN_VENETIAN_BLIND);
		} else if (design == XMLSchemaDesign.RUSSIAN_DOLL) {
			inst2XsdOptions.setDesign(Inst2XsdOptions.DESIGN_RUSSIAN_DOLL);
		} else if (design == XMLSchemaDesign.SALAMI_SLICE) {
			inst2XsdOptions.setDesign(Inst2XsdOptions.DESIGN_SALAMI_SLICE);
		}

		SchemaDocument[] schemaDocuments = Inst2Xsd.inst2xsd(xmlInstances,
				inst2XsdOptions);
		if (schemaDocuments != null && schemaDocuments.length > 0) {
			return schemaDocuments[0];
		}

		return null;
	}
}



5 Minutes on Java : JSON to XML

You’ve hit this page because you are looking for this utility, as I was today.  I wanted an easy way to convert JSON to XML.  Thanks to http://openjsan.org/src/k/ka/kawasaki, who implemented XML.ObjTree, which performs JSON-to-XML conversion very elegantly.  All I needed to do was put a Java wrapper around it.  Here is XMLObjTree.java, a Java wrapper for Kawasaki’s implementation.

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// A Java wrapper for XML.ObjTree
// Thanks to - http://openjsan.org/src/k/ka/kawasaki/XML.ObjTree-0.24/README
public class XMLObjTree {
    public static ScriptEngine getJsengine() {
		return jsEngine;
	}

	private static ScriptEngine jsEngine;

    static {
        try {
        	jsEngine = new ScriptEngineManager().getEngineByExtension("js");

        	InputStream is = XMLObjTree.class.getResourceAsStream("/scripts/js/XML/ObjTree.js");
        	InputStreamReader reader = new InputStreamReader(is);

        	jsEngine.eval(reader);

			is.close();
			reader.close();
		} catch (ScriptException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
    }

    public String jsonToXml(final File jsonFile) throws FileNotFoundException, IOException {
    	return (jsonToXml(StreamUtils.streamToString(jsonFile)));
    }

    public String jsonToXml(final InputStream is){
    	String input = StreamUtils.streamToString(is);

    	return (jsonToXml(input));
    }

    public String jsonToXml(final String jsonText) {
        try{
            return (String) jsEngine.eval("(new XML.ObjTree()).writeXML(" + jsonText + ");");
        }
        catch(ScriptException ex) {
            throw new RuntimeException(ex);
        }
    }
}

Here is the StreamUtils class you’ll need.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

public class StreamUtils {
	public static String streamToString(InputStream is) {
		try
		{
			final char[] buffer = new char[4096];
			StringBuilder out = new StringBuilder();

			Reader in = new InputStreamReader(is, "UTF-8");

			int read;
			do
			{
				read = in.read(buffer, 0, buffer.length);
				if (read > 0)
				{
					out.append(buffer, 0, read);
				}
			} while (read >= 0);

			in.close();

			return out.toString();
		} catch (IOException ioe)
		{
			throw new IllegalStateException("Error while reading input stream", ioe);
		}
	}

	public static String streamToString(File file) throws FileNotFoundException, IOException {
		FileInputStream fis = new FileInputStream(file);

		String string = streamToString(fis);
		fis.close();

		return string;
	}

	public static void write(final String data, final Writer writer) throws IOException{
		writer.write(data);

		writer.flush();

		writer.close();
	}

	public static void write(final String data, final File file) throws IOException{
		try{
			FileOutputStream fos = new FileOutputStream(file);

			Writer writer = new OutputStreamWriter(fos, "UTF-8");
			write(data, writer);

			fos.close();
		} catch(Exception e) {
			e.printStackTrace();
		} finally {
		}
	}
}
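On Java 9+ the hand-rolled read loop in StreamUtils collapses to stdlib one-liners; a sketch (ModernStreamUtils is my naming, not part of the original):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ModernStreamUtils {
	// Drain an InputStream into a String (Java 9+: InputStream.readAllBytes).
	public static String streamToString(InputStream is) throws IOException {
		return new String(is.readAllBytes(), StandardCharsets.UTF_8);
	}

	// Read a whole file as UTF-8 text (Java 11+: Files.readString).
	public static String fileToString(Path path) throws IOException {
		return Files.readString(path);
	}
}
```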

Let’s put it to use with some sample JSON text.

    public static void main(String[] args){
		final String filePath = "/Shared/test-data/test-data.json";
		final File jsonFile = new File(filePath);

		try {
			XMLObjTree xmlObjTree = new XMLObjTree();

			System.out.println("XML\n\n" + xmlObjTree.jsonToXml(jsonFile));
		} catch (Exception e) {
			e.printStackTrace();
		}
	}



SocialMedia : Facebook Real Time subscriptions …

There are a couple of things required to work with Facebook real-time updates.  I had a tough time understanding what they were trying to do, but after some reading and experimenting it became easy; you can do it too.  Here are a few steps that can be tried using a REST-client plug-in (Chrome/IE/Firefox).

First: you need to ask Facebook for an access token.  Remember, this is NOT the same access token that you use with the Graph API; that one would have been generated via the OAuth access-token flow.

https://graph.facebook.com/oauth/access_token?client_id=<CLIENT_ID>&client_secret=<CLIENT_SECRET>&type=client_cred

This returns the access token in the response body:

access_token=<ACCESS_TOKEN_FROM_FACEBOOK>


You will use this access token in your requests to inspect your subscriptions.  This is an HTTP GET request:

https://graph.facebook.com/<CLIENT_ID>/subscriptions?access_token=<OBTAINED_ACCESS_TOKEN>


This should generate a response like the following, which is proof that your request succeeded:

{    "data": [] }

Once you’ve confirmed that you can make contact, the next step is to ask Facebook to push the objects you’re interested in to your REST server.  To do this, you make a request to Facebook telling it where to push.  This time, the same request is sent as a POST with the data fields filled in.

https://graph.facebook.com/<CLIENT_ID>/subscriptions?access_token=<OBTAINED_ACCESS_TOKEN>

The following data fields are mandatory:

object=user&fields=name,feed,friends&callback_url=http://<YOUR_PUBLIC_HOST>:<YOUR_PUBLIC_PORT>/facebook/realtime_callback&verify_token=SOME_RANDOM_TEXT

Here is a list of the supported fields and their meanings:

Field                         Meaning
-----------------------------------------------------------------------------
object                        The object type you are interested in subscribing to
fields                        The fields Facebook must send back when an update occurs
callback_url                  A publicly hosted service URL to which Facebook will push real-time updates
verify_token                  A random string you send to Facebook, echoed back to you; this guards against spoofed requests

Here is the code for the HTTP service that handles notifications from Facebook.

enum Verb { GET, POST, PUT, DELETE };
final String verifiyToken = UUID.randomUUID().toString();
final String CALLBACK_CONTEXT = "/facebook/realtime_callback";
HttpServer server = null;
// create an instance of the lightweight HTTP server on port
try {
	server = HttpServer.create(new InetSocketAddress(PORT), 0);
} catch (IOException e) {
	System.out.println("IOException while launching HttpServer " + e.getMessage());
}
server.createContext(CALLBACK_CONTEXT, new HttpHandler() {

	public void handle(HttpExchange exchange) throws IOException {
		final String requestMethod = exchange.getRequestMethod();

		final String query = exchange.getRequestURI().getQuery();

		URLParams urlParams = new URLParams(query);

		OutputStream os = exchange.getResponseBody();
		Headers responseHeaders = exchange.getResponseHeaders();

		// List subscriptions
		if(requestMethod.compareToIgnoreCase(Verb.GET.name()) == 0){
			System.out.println("GET Request");

			String hubMode = urlParams.getParam("hub.mode");
			String hubVerifyToken= urlParams.getParam("hub.verify_token");

			if((hubMode.compareTo("subscribe")) == 0 && (hubVerifyToken.compareTo(verifiyToken) == 0)) {
				// Headers must be added before sendResponseHeaders()
				responseHeaders.add("Content-Type", "text/plain");
				exchange.sendResponseHeaders(200, 0);

				String hubChallenge = urlParams.getParam("hub.challenge");

				if(hubChallenge != null) {
					os.write(hubChallenge.getBytes());
				}
			}
		}
		// Add/Modify subscription
		else if(requestMethod.compareToIgnoreCase(Verb.POST.name()) == 0){
			System.out.println("POST Request");

			// TODO: Read the stream
			InputStream is= exchange.getRequestBody();
			try {
				System.out.println("Response Received: " + streamToString(is));
			} catch (Exception e) {
				e.printStackTrace();
			}
		}
		// Delete subscription
		else if(requestMethod.compareToIgnoreCase(Verb.DELETE.name()) == 0){
			System.out.println("DELETE Request");
		}

		os.close();
	}
});
// creates a default executor
server.setExecutor(null);
// start the server
System.out.println("Embedded Facebook Realtime callback services started!!");
server.start();
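The URLParams helper used in the handler above isn’t listed; here is a minimal sketch of what it could look like, assuming simple key=value query pairs:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

// A minimal sketch of the URLParams helper: splits a raw query string
// ("hub.mode=subscribe&hub.challenge=123") into decoded key/value pairs.
public class URLParams {
	private final Map<String, String> params = new HashMap<String, String>();

	public URLParams(String query) {
		if (query == null) return;
		for (String pair : query.split("&")) {
			int eq = pair.indexOf('=');
			if (eq <= 0) continue; // skip malformed pairs
			try {
				params.put(URLDecoder.decode(pair.substring(0, eq), "UTF-8"),
						URLDecoder.decode(pair.substring(eq + 1), "UTF-8"));
			} catch (UnsupportedEncodingException e) {
				// UTF-8 is always supported; unreachable in practice
			}
		}
	}

	public String getParam(String key) {
		return params.get(key);
	}
}
```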

Here is the code that converts a stream to a string.

public static String streamToString(InputStream is) {
	try
	{
		final char[] buffer = new char[4096];
		StringBuilder out = new StringBuilder();
		Reader in = new InputStreamReader(is, "UTF-8");
		int read;
		do
		{
			read = in.read(buffer, 0, buffer.length);
			if (read > 0)
			{
				out.append(buffer, 0, read);
			}
		} while (read >= 0);

		in.close();

		return out.toString();
	} catch (IOException ioe)
	{
		throw new IllegalStateException("Error while reading input stream", ioe);
	}
}

For more details about what Facebook will send you and how you must handle it, refer to the real-time API docs.

https://developers.facebook.com/docs/reference/api/realtime/



5 Minutes on Hadoop and Hive : Get boot-strapped with BigData

What is interesting for this blog entry is the use of the HDFS and Hive layers for processing high-volume data files.

I wish I had enough processing power on my laptop to clone the VM and emulate a 2-4 node cluster; instead, I used a single Ubuntu Linux (64-bit) VM.

I installed Hadoop (http://www.apache.org/dyn/closer.cgi/hadoop/common/) and Hive (http://www.apache.org/dyn/closer.cgi/hive/).

Once installed, you need to set a few environment variables to make things easier:

export JAVA_HOME=<PATH_TO_JDK>
export HIVE_HOME=<PATH_TO_HIVE>
export HADOOP_HOME=<PATH_TO_HADOOP>

Once you’ve extracted the Hadoop and Hive archives into the above paths, you need to edit the configuration files to add your node information. You’ll have to edit the following files in $HADOOP_HOME/conf/:

core-site.xml

hdfs-site.xml

mapred-site.xml

Add the following to the <configuration> element of each file; this must be repeated on every node participating in the cluster. Make sure the host name is resolvable; if not, make the appropriate entries in /etc/hosts.

<property>
	<name>fs.default.name</name>
	<value>hdfs://localhost:9000</value>
</property>
<property>
	<name>dfs.datanode.address</name>
	<value>localhost:50090</value>
</property>
<property>
	<name>dfs.datanode.http.address</name>
	<value>localhost:50075</value>
</property>
<property>
	<name>mapred.job.tracker</name>
	<value>hdfs://localhost:9001</value>
</property>
<property>
	<name>dfs.replication</name>
	<value>1</value>
</property>

Add these variables to PATH

export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$PATH

Edit the following files to customise the environment

Edit $HADOOP_HOME/conf/hadoop-env.sh:
uncomment the line containing JAVA_HOME and set its value for your environment.

Edit $HIVE_HOME/conf/hive-default.xml:
look for the key hive.metastore.warehouse.dir and
modify its value to the path where you want Hive to store its files.

NOTE: Run all the commands as a non-root user.

The next thing to do is format the HDFS filesystem:

$ hadoop namenode -format

This formats the filesystem and prepares HDFS for use. Since HDFS is literally a file system, you need to upload the files that need to be processed. Once the files are on a common path across all the nodes, the MR (MapReduce) job can be written and deployed to get the work done.

$ $HADOOP_HOME/bin/start-all.sh

This starts the Hadoop services.

Now, Hive simplifies all of that with a SQL-like interface, and saves me from implementing the MR job myself.

To try this out, I used the following CSV file from http://explore.data.gov/Foreign-Commerce-and-Aid/U-S-Overseas-Loans-and-Grants-Greenbook-/5gah-bvex
The file can be downloaded and used for this sample.

Now launch Hive, and execute all the following commands at the Hive prompt.

$ hive
hive> CREATE TABLE usa_gov_data (country STRING, comments STRING, year1 INT, year2 INT, year3 INT, year4 INT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe'
WITH SERDEPROPERTIES (
'serialization.format'='org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol',
'quote.delim'='("|\[|\])',
'field.delim'=',',
'serialization.null.format'='-')
STORED AS TEXTFILE;

When the command is executed, you will see the following messages on successful execution.

OK
Time taken: 7.879 seconds

The CSV file contains 2,500 rows and 60 columns.  For simplicity of demonstration, I've created a table that accommodates only the first 6 columns.
The serialization.format and the serialization/deserialization classes are specified explicitly.  These helper classes perform the parsing of the file during the load.
quote.delim : parses quote characters out of the CSV file.
field.delim : change this to whatever delimiter your file uses (don't forget to escape it).
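What the field.delim and quote.delim settings ask the SerDe to do can be sketched in plain Java: split each line on commas, but not on commas inside double quotes, then strip the quote characters. (CsvLine is an illustrative helper of mine, not Hive code.)

```java
import java.util.ArrayList;
import java.util.List;

public class CsvLine {
	public static List<String> parse(String line) {
		List<String> fields = new ArrayList<String>();
		StringBuilder field = new StringBuilder();
		boolean inQuotes = false;
		for (char c : line.toCharArray()) {
			if (c == '"') {
				inQuotes = !inQuotes;         // quote.delim: toggle, don't emit
			} else if (c == ',' && !inQuotes) {
				fields.add(field.toString()); // field.delim: end of field
				field.setLength(0);
			} else {
				field.append(c);
			}
		}
		fields.add(field.toString());         // last field has no trailing comma
		return fields;
	}
}
```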

To list the tables that you’ve created

hive> show tables;
 OK
 usa_gov_data
 Time taken: 0.156 seconds

Now it’s time to upload the content:

hive> LOAD DATA LOCAL INPATH '<PATH_TO>/us_economic_assistance.csv' OVERWRITE INTO TABLE usa_gov_data;
Copying data from file:/home/innovator/Downloads/us_economic_assistance.csv
Copying file: file:/home/innovator/Downloads/us_economic_assistance.csv
Loading data to table default.usa_gov_data
Deleted hdfs://localhost:9000/user/hive/warehouse/usa_gov_data
OK

Time taken: 0.469 seconds

Once the file is uploaded, we can run queries. (If you inspect HDFS, you will find the whole file as-is in that location.) This implies that HDFS is just a placeholder for all objects; it is up to the MR job to process them while applying the query logic.

hive> select * from usa_gov_data where country = 'Afghanistan';
Total MapReduce jobs = 1
Launching Job 1 out of 1

Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201107012339_0001, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201107012339_0001
Kill Command = /home/innovator/apps/hadoop/bin/../bin/hadoop job  -Dmapred.job.tracker=hdfs://localhost:9001 -kill job_201107012339_0001
2011-07-01 23:41:31,068 Stage-1 map = 0%,  reduce = 0%
2011-07-01 23:41:36,215 Stage-1 map = 100%,  reduce = 0%
2011-07-01 23:41:39,244 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201107012339_0001
OK

Here is a snapshot of the results


Afghanistan Child Survival and Health 0 0 0 0
Afghanistan Department of Defense Security Assistance 0 0 0 0
Afghanistan Development Assistance 0 0 0 0
Afghanistan Economic Support Fund/Security Support Assistance 0 0 0 0
Afghanistan Food For Education 0 0 0 0
Afghanistan Global Health and Child Survival 0 0 0 0
Afghanistan Inactive Programs 0 0 0 0
Afghanistan Migration and Refugee Assistance 0 0 0 0
Afghanistan Narcotics Control 0 0 0 0
Afghanistan Nonproliferation, Anti-Terrorism, Demining and Related 0 0 0 0
Afghanistan Other Active Grant Programs 0 0 0 0
Afghanistan Other State Assistance 0 0 0 0
Afghanistan Other USAID Assistance 0 0 0 0
Afghanistan Other USDA Assistance 0 0 0 0
Afghanistan Peace Corps 0 0 0 0
Afghanistan Section 416(b)/ Commodity Credit Corporation Food for Progress 0 0 0 0
Afghanistan Title I 0 0 0 0
Afghanistan Title II 0 0 0 0
Time taken: 16.61 seconds

Just in case you want to get rid of the table you created …

hive> drop table usa_gov_data;

That completes an end-to-end demo of how Hadoop and Hive can be leveraged. Now imagine a high-volume-data scenario with loads of files sitting across a stack of HDFS nodes: Hive will federate the query to all the nodes participating in the cluster and return an aggregated view of the data, which makes it a really powerful tool for distributed computing.

The remaining detail is to work out the parsing implementation, which is the key to processing your objects; understand that part well.