json - bash: split pipe stream into records and combine all lines in a record into one


I have a file containing a million individual XML documents (simply concatenated) that I want to convert to JSON. The file looks like this:

<amf xmlns="...">
  <test>
    1 content
  </test>
</amf>
<amf xmlns="...">
  <test>
    2 content
  </test>
</amf>

Note that the file above is not a well-formed XML file (i.e. the individual entries are not nested under a single root), so I cannot convert it directly with `xml2json`.

To achieve what I want, I plan to separate the file into records, where each record corresponds to one individual XML document, concatenate each record onto a single line, and then use parallel on each line, applying xml2json to produce the JSON output.
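For that last step, here is a minimal sketch, assuming GNU parallel is installed and that xml2json reads one XML document on stdin and writes JSON on stdout (the input and output file names are hypothetical):

# --pipe splits stdin on newlines; -N1 sends exactly one record (line) to each xml2json job
parallel --pipe -N1 xml2json < one_line_records.txt > records.json

With a million records, one process per record will be slow; if xml2json can handle several documents per invocation, a larger -N would reduce the overhead, but that depends on the tool.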

When I try to use awk or gawk on OS X, I have trouble splitting the pipe into records. Here's the code I tried (the "useless" cat is there for readability):

cat bigfile.xml | awk '{print NR "<amf xml"$0}' RS="<amf xml"

which gives:

1<amf xml
2<amf xmlns="...">
  <test>
    1 content
  </test>
</amf>

3<amf xmlns="...">
  <test>
    2 content
  </test>
</amf>

It's easy to remove the first 'record', but I can't collapse the output of the other records onto one line per record. I tried experimenting with FS="\n" and OFS=" " without luck.

Can anyone help me output these records with one line per record?

With GNU awk for multi-char RS and RT:

$ awk -v RS='</amf>\n' '{$1=$1; ORS=RT}1' file
<amf xmlns="..."> <test> 1 content </test></amf>
<amf xmlns="..."> <test> 2 content </test></amf>
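A few words on how that works: RS='</amf>\n' makes each XML document its own record, $1=$1 forces gawk to rebuild $0 with the default OFS (a single space), which collapses the internal newlines, and ORS=RT re-appends the exact record terminator that was matched, so every record ends with </amf> and a newline. The trailing 1 is the usual awk idiom for "print the record". Hooked up to the planned conversion step (again assuming xml2json reads stdin, as sketched above):

awk -v RS='</amf>\n' '{$1=$1; ORS=RT}1' bigfile.xml | parallel --pipe -N1 xml2json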
