{"id":6,"date":"2014-05-28T00:47:59","date_gmt":"2014-05-27T15:47:59","guid":{"rendered":"http:\/\/b.openlook.org\/b\/?p=6"},"modified":"2017-06-02T00:42:30","modified_gmt":"2017-06-01T15:42:30","slug":"merging-multiple-bgzf-files-very-fast","status":"publish","type":"post","link":"https:\/\/openlook.org\/wp\/merging-multiple-bgzf-files-very-fast\/","title":{"rendered":"Merging multiple BGZF files"},"content":{"rendered":"<ul>\n<li>A bed file holds 200 million lines of data.<\/li>\n<li>A quick Python script for the computation took about a day as a single process, just for chromosome 1.<\/li>\n<li>And this is not the only dataset like that; more keep coming.<\/li>\n<\/ul>\n<p>In a situation like this, the old answers were &#8220;write it well in C&#8221; or &#8220;split the file and feed it to a cluster.&#8221; You could also try to make something like <a href=\"http:\/\/www.gnu.org\/software\/parallel\/\">GNU parallel<\/a> work.<\/p>\n<p>I have covered it on this blog before: for this kind of job, <a href=\"http:\/\/samtools.sourceforge.net\/tabix.shtml\">tabix<\/a> solves most problems in one go! 
<del datetime=\"2014-02-24T14:30:02+00:00\">Oh, the great and mighty Heng Li&#8230;<\/del> Because the text stays compressed yet supports random access from any point, there is no need to split the file, which saves the enormous I\/O time wasted when splitting files or using parallel. And if the results are also stored pre-compressed in bgzf, the format tabix uses, simply concatenating them should let the next step be distributed again without any trouble. The format is not limited to bed either; any tab-delimited data works as long as it is sorted.<\/p>\n<p>In theory plain concatenation should just work, but surprisingly it does not. However the files are concatenated, only the first one is recognized and the rest are ignored. So I tried decompressing and recompressing <del datetime=\"2014-05-28T13:33:54+00:00\">the clumsy way<\/del>, but <a href=\"https:\/\/github.com\/nh13\/samtools\/tree\/master\/pbgzip\">pbgzip<\/a>, which supports multithreading, keeps throwing random buffer errors. 
(I tried to fix it, but failed orz)<\/p>\n<p>After going back and forth, I decided against patching tabix: a file with EOF blocks in the middle is a bit perverse anyway, <del datetime=\"2014-02-24T14:30:02+00:00\">and now that I am older, patching and submitting upstream is a hassle,<\/del> so I just wrote a tool that merges the files while leaving out the EOF blocks.<\/p>\n<p>Here &#8211;&gt; <a href=\"https:\/\/github.com\/hyeshik\/rnarry\/blob\/master\/general_formats\/bgzf-merge.py\">bgzf-merge.py<\/a><\/p>\n<p>Merge bgzf files with it and, ta-da, tabix and other tools that take bgzf input work cleanly. And of course the speed is in a different league from decompressing and recompressing. lol<\/p>\n<pre class=\"nums:false lang:zsh decode:true\">% .\/bgzf-merge.py --output spots-merged.txt.gz spots-{1101,1102,1103}.txt.gz\r\nspots-1101.txt.gz\r\nspots-1102.txt.gz\r\nspots-1103.txt.gz\r\n% tabix -0 -s 1 -b 2 -e 2 spots-merged.txt.gz\r\n% tabix -0 spots-merged.txt.gz 1101|head -2\r\n1101 1 50\r\n1101 7 33\r\n% tabix -0 spots-merged.txt.gz 1102|head -2\r\n1102 10 172\r\n1102 17 10\r\n% tabix -0 spots-merged.txt.gz 1103|head -2\r\n1103 71 9\r\n1103 139 71<\/pre>\n<p>Like this~<\/p>\n<p>Ah, but all of this only applies when you have a normally idle cluster sitting next to you. 
If you don't&#8230;&#8230; just go with C -\u3147-;;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A bed file holds 200 million lines of data. A quick Python script for the computation took about a day as a single process, just for chromosome 1. And this is not the only dataset like that; more keep coming. In a situation like this, the old answers were &#8220;write it well in C&#8221; or &#8220;split the file and feed it to a cluster.&#8221; You could also try to make something like GNU parallel work. 
I have covered it on this blog before &#8230; <a title=\"Merging multiple BGZF files\" class=\"read-more\" href=\"https:\/\/openlook.org\/wp\/merging-multiple-bgzf-files-very-fast\/\" aria-label=\"Read more about Merging multiple BGZF files\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13],"tags":[],"class_list":["post-6","post","type-post","status-publish","format-standard","hentry","category-compbio"],"_links":{"self":[{"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/posts\/6","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/comments?post=6"}],"version-history":[{"count":11,"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/posts\/6\/revisions"}],"predecessor-version":[{"id":1067,"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/posts\/6\/revisions\/1067"}],"wp:attachment":[{"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/media?parent=6"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/categories?post=6"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/openlook.org\/wp\/wp-json\/wp\/v2\/tags?post=6"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}