java - What configuration is required to get data from Object Storage via Swift in Spark?


I went through the documentation but am still confused about how to get data from Swift.

I configured Swift on one Linux machine. Using the command below, I am able to list the containers:

swift -a https://acc.objectstorage.softlayer.net/auth/v1.0/ -u username -k passwordkey list

I have seen many blogs, for example Bluemix (https://console.ng.bluemix.net/docs/services/analyticsforapachespark/index-gentopic1.html#gentopprocid2), where the following code is written:

sc.textFile("swift://container.myacct/file.xml")

I am looking to integrate this in Java Spark. I need to configure the Object Storage credentials in Java code. Is there any sample code or blog?

This notebook illustrates a number of ways to load data using the Scala language. Scala runs on the JVM, and Java and Scala classes can be freely mixed, no matter whether they reside in different projects or in the same one. Looking at the mechanics of how the Scala code interacts with OpenStack Swift Object Storage should guide you in crafting a Java equivalent.
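
Since the question asks for Java specifically, here is a minimal sketch of what the equivalent configuration could look like in a Java Spark application. It simply mirrors the swift2d/Stocator property names that the Scala helper further down sets; the application name, class name, and placeholder credential values are illustrative, not taken from the original notebook.

import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SwiftObjectStorageConfig {

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("swift-object-storage-example");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // "sparksql" is the datasource name; it must match the host suffix of the
        // swift2d URL used later (swift2d://<container>.sparksql/<filename>).
        String prefix = "fs.swift2d.service.sparksql";
        Configuration hconf = jsc.hadoopConfiguration();

        // Register the Stocator file system implementation for the swift2d:// scheme.
        hconf.set("fs.swift2d.impl", "com.ibm.stocator.fs.ObjectStoreFileSystem");

        // Keystone V3 credentials, analogous to the Scala helper shown below.
        // The <...> placeholders stand in for the values from your own credentials.
        hconf.set(prefix + ".auth.url", "https://identity.open.softlayer.com/v3/auth/tokens");
        hconf.set(prefix + ".auth.method", "keystoneV3");
        hconf.set(prefix + ".tenant", "<project_id>");
        hconf.set(prefix + ".username", "<user_id>");
        hconf.set(prefix + ".password", "<password>");
        hconf.set(prefix + ".region", "dallas");
        hconf.setBoolean(prefix + ".public", true);
    }
}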

From the above notebook, here are the steps illustrating how to configure and extract data from an OpenStack Swift Object Storage instance with the Stocator library, using the Scala language. The Swift URL decomposes into:

swift2d :// container . myacct / filename.extension
   ^            ^          ^            ^
stocator     name of   namespace    object storage
protocol     container               filename

Imports

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import scala.util.control.NonFatal
import play.api.libs.json.Json

val sqlctx = new SQLContext(sc)
val scplain = sqlctx.sparkContext

Sample creds

// @hidden_cell
var credentials = scala.collection.mutable.HashMap[String, String](
  "auth_url" -> "https://identity.open.softlayer.com",
  "project" -> "object_storage_3xxxxxx3_xxxx_xxxx_xxxx_xxxxxxxxxxxx",
  "project_id" -> "6xxxxxxxxxx04fxxxxxxxxxx6xxxxxx7",
  "region" -> "dallas",
  "user_id" -> "cxxxxxxxxxxaxxxxxxxxxx1xxxxxxxxx",
  "domain_id" -> "cxxxxxxxxxxaxxyyyyyyxx1xxxxxxxxx",
  "domain_name" -> "853255",
  "username" -> "admin_cxxxxxxxxxxaxxxxxxxxxx1xxxxxxxxx",
  "password" -> """&m7372!fake""",
  "container" -> "notebooks",
  "tenantId" -> "undefined",
  "filename" -> "file.xml"
)

Helper method

def setRemoteObjectStorageConfig(name: String, sc: SparkContext, dsConfiguration: String): Boolean = {
  try {
    val result = scala.util.parsing.json.JSON.parseFull(dsConfiguration)
    result match {
      case Some(e: Map[String, String]) => {
        val prefix = "fs.swift2d.service." + name
        val hconf = sc.hadoopConfiguration
        hconf.set("fs.swift2d.impl", "com.ibm.stocator.fs.ObjectStoreFileSystem")
        hconf.set(prefix + ".auth.url", e("auth_url") + "/v3/auth/tokens")
        hconf.set(prefix + ".tenant", e("project_id"))
        hconf.set(prefix + ".username", e("user_id"))
        hconf.set(prefix + ".password", e("password"))
        hconf.set(prefix + ".auth.method", "keystoneV3")
        hconf.set(prefix + ".region", e("region"))
        hconf.setBoolean(prefix + ".public", true)
        println("Successfully modified SparkContext object with remote Object Storage credentials using datasource name " + name)
        println("")
        return true
      }
      case None =>
        println("Failed.")
        return false
    }
  } catch {
    case NonFatal(exc) =>
      println(exc)
      return false
  }
}

Load data

val setObjStor = setRemoteObjectStorageConfig("sparksql", scplain, Json.toJson(credentials.toMap).toString)
val data_rdd = scplain.textFile("swift2d://notebooks.sparksql/" + credentials("filename"))
data_rdd.take(5)
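
For completeness, here is a hedged Java sketch of the same "load data" step. It assumes a JavaSparkContext that already carries the swift2d configuration sketched earlier; the container name "notebooks", the datasource name "sparksql", and "file.xml" follow the sample credentials above, and the class/method names are illustrative.

import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SwiftObjectStorageRead {

    // jsc must already have the fs.swift2d.* properties set, as in the earlier sketch.
    public static void readSample(JavaSparkContext jsc) {
        // URL shape: swift2d://<container>.<datasource name>/<filename>
        JavaRDD<String> data = jsc.textFile("swift2d://notebooks.sparksql/file.xml");

        // Print the first five lines, mirroring data_rdd.take(5) in the Scala cell above.
        List<String> firstLines = data.take(5);
        for (String line : firstLines) {
            System.out.println(line);
        }
    }
}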
