CnUnix

[ CnUnix ] in KIDS
By: belami (- 커피 -)
Date: Friday, November 10, 1995, 21:49:30 KST
Title: Garbled Hangul mail (recovery program)


Below I'm attaching a program that recovers Hangul text whose MSB has been stripped off.
The article you posted came back mostly intact, apart from a few control codes.
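
As background for the program that follows, here is a minimal sketch of what "MSB stripped" means for KSC5601 text: every byte of a Hangul pair has its top bit set, and a 7-bit mail path clears it. The names strip_msb and dump_hex are hypothetical helpers for illustration only; they are not part of cureksc.

```c
#include <stddef.h>
#include <stdio.h>

/* Clear the top bit of every byte in place, roughly what a 7-bit
 * mail path does to 8-bit text. Hypothetical helper, not from cureksc. */
void strip_msb(unsigned char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        buf[i] &= 0x7f;
}

/* Print the bytes as hex so the damage is visible on any terminal. */
void dump_hex(const unsigned char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        printf("%02x ", buf[i]);
    putchar('\n');
}
```

For example, the four bytes c7 d1 b1 db (the word 한글 in KSC5601) become 47 51 31 5b after strip_msb, which a reader sees as the ASCII text GQ1[.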

/*-
 * usage: cureksc < infile > outfile
 *   cureksc tries to recover a KSC5601 Hangul text file whose MSBs were stripped.
 *   cureksc uses a simple heuristic and a two-state automaton to guess the
 *           original characters.
 *   cureksc only works with the KSC5601 Korean code, and STRONGLY assumes
 *           the original text doesn't contain any Chinese or special characters.
 * $Log: cureksc.c,v $
 * Revision 1.1  1991/12/31  18:02:06  rhee
 * Initial revision
 *
 *
 */
#include <stdio.h>
#include <ctype.h>                                   
#include <string.h>
#include <assert.h>

int main(void)
{
    int             c1;         /* previous byte, MSB forced on */
    int             c2;         /* current byte; int so EOF stays distinct */
    unsigned int    hc;
    int             status;

    status = 0;                 /* status == 0. english */
    c1 = 0;                     /* previous-char */
    while ((c2 = getchar()) != EOF) {
        c2 &= 0x007f;
        c2 |= 0x0080;
        switch (status) {
            case 0:             /* english */
                if (c2 >= 0x00b0 && c2 <= 0x00c8) {     /* possibly a hangul1 */
                    status = 1;
                } else {
                    status = 0;
                    putchar(c2 & 0x007f);            
                }
                break;
            case 1:             /* possibly a hangul1 */
                if (c2 >= 0x00a1 && c2 <= 0x00fe) {     /* possibly a hangul2 */
                    status = 0; /* maybe hangul, flush */
                    putchar(c1);
                    putchar(c2);
                } else if (c2 >= 0x00b0 && c2 <= 0x00c8) {
                    /* never taken: 0xb0..0xc8 is already covered by the test above */
                    status = 1;
                    putchar(c1 & 0x007f);
                } else {
                    status = 0;
                    putchar(c1 & 0x007f);
                    putchar(c2 & 0x007f);
                }
                break;
            default:            /* error */
                assert(0);
        }
        c1 = c2;
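    }
    if (status == 1)            /* a lead-byte candidate is still pending at EOF */
        putchar(c1 & 0x007f);   /* emit it as plain ASCII instead of dropping it */
    return 0;
}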
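
To try the heuristic on a concrete input, here is a sketch that applies the same byte ranges (0xb0..0xc8 for the first byte, 0xa1..0xfe for the second) to an in-memory buffer. This is an illustration only, not the author's code: recover_buf is a hypothetical name, and the function assumes the caller supplies an output buffer at least as large as the input. A first-byte candidate left over at the end of the input is written out as plain ASCII.

```c
#include <stddef.h>

/* Hypothetical buffer-based variant of the heuristic above.
 * Reads n bytes from in, writes the guessed text to out, and
 * returns the number of bytes written (always exactly n). */
size_t recover_buf(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0;
    size_t o = 0;

    while (i < n) {
        unsigned char c1 = in[i] | 0x80;            /* candidate first byte */

        if (c1 >= 0xb0 && c1 <= 0xc8 && i + 1 < n) {
            unsigned char c2 = in[i + 1] | 0x80;    /* candidate second byte */

            if (c2 >= 0xa1 && c2 <= 0xfe) {
                out[o++] = c1;                      /* accept as a Hangul pair */
                out[o++] = c2;
                i += 2;
                continue;
            }
        }
        out[o++] = in[i] & 0x7f;                    /* otherwise keep as ASCII */
        i++;
    }
    return o;
}
```

Feeding it the four ASCII bytes GQ1[ gives back c7 d1 b1 db. Note that ordinary ASCII such as A1 also falls inside both ranges and is rewritten as a pair (c1 b1), which is why the header comment says the program strongly assumes Hangul-only input.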
