Lucene入门例子

發表於 2014-08-27

开始Lucene之路,从官网下了最新的4.9.0,从先从小例子开始.
建索引

package org.dsl;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import java.io.File;
import java.io.IOException;
public class Index {
    public static void main(String[] args) throws IOException {
        String INDEX_DIR = "e:\\index";
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_4_9);
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_4_9, analyzer);
        IndexWriter writer = null;
        iwc.setOpenMode(OpenMode.CREATE);
        iwc.setUseCompoundFile(false);
        try {
            writer = new IndexWriter(FSDirectory.open(new File(INDEX_DIR)), iwc);
            Document doc = new Document();
            doc.add(new TextField("title", "who are you, you are a man", Field.Store.YES));
            doc.add(new TextField("content", "A long way to go there. Please drive a car", Field.Store.NO));
            writer.addDocument(doc);
            doc = new Document();
            doc.add(new TextField("title", "are you sure", Field.Store.YES));
            doc.add(new TextField("content", "He is a good man. He is a driver", Field.Store.NO));
            writer.addDocument(doc);
            writer.commit();
            writer.close();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

}

搜索

package org.dsl;
import java.io.File;
import java.util.Date;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
public class Search {
    private Search() {}
    public static void main(String[] args) throws Exception {
        String index = "e:\\index";
        IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
        IndexSearcher searcher = new IndexSearcher(reader);
        String queryString = "driver";
        Query query = new TermQuery(new Term("content", queryString));
        System.out.println("Searching for: " + query.toString());
        Date start = new Date();
        TopDocs results = searcher.search(query, null, 100);
        Date end = new Date();
        System.out.println("Time: "+(end.getTime()-start.getTime())+"ms");
        ScoreDoc[] hits = results.scoreDocs;
        int numTotalHits = results.totalHits;
        System.out.println(numTotalHits + " total matching documents");
        for (int i = 0; i < hits.length; i++) {
            String output = "";
            Document doc = searcher.doc(hits[i].doc);
            output += "doc="+hits[i].doc+" score="+hits[i].score;
            String title = doc.get("title");
            if (title != null) {
                output += " " + title;
            }
            System.out.println(output);
        }
        reader.close();
    }
}

联系作者

数字漩涡

By robinjia

發表於 2014-08-22

还在学校的时候，绍祝师兄每次面试回来，如果有趣题都会和我讨论，因为正好坐他边上，记得这题也是那时讨论的一题。记得当时想了一会后就做出来了。只是时隔多年，再次遇到这题，已经忘记当初是怎么做了。

题目很简单，对于3 打印
1 2 3
8 9 4
7 6 5
对于4，打印
1 2 3 4
12 13 14 5
11 16 15 6
10 9 8 7

观察之后发现规律，先是向右一直走，之后向下一直走，之后向左，最后向上，每次变换方向的原因有两个，一个是走到矩形的边界，另一个是沿着这个方向走，前面的一个位置已经走过了。在当前位置，要找下一个有效位置，只需按顺序遍历上面四个方向即可。写成代码如下：

def spiral_number(N):
    step_x = [0, 1, 0, -1]
    step_y = [1, 0, -1, 0]
    a = [[0 for j in xrange(N)] for j in xrange(N)]
    dir = 0
    i = 0
    j = 0
    a[i][j] = 1
    n = 2
    while n <= N ** 2:
        x = i + step_x[dir]
        y = j + step_y[dir]
        if x >= 0 and x < N and y >= 0 and y < N and a[x][y] == 0:
            a[x][y] = n
            n += 1
            i = x
            j = y
        else:
            dir = (dir + 1 + len(step_x)) % len(step_x)

    for i in xrange(N):
        for j in xrange(N):
            print a[i][j],
        print

spiral_number(4)

联系作者

Python筛法求素数的优化

By robinjia

發表於 2014-08-16

在pythontip上做题时,有这样一道题，
给你一个正整数N(1 <= N <= 10000000)，求{1,2,3,…,N}中质数的个数。
如N=3，输出2.
也就是求N以内质数的个数。
刚开始用以前写的筛法写了一个最初版本，

N = 10000000
primes = [True for i in xrange(N + 1)] 
primes[0] = primes[1] = False
for i in xrange(2, N + 1): 
    if not primes[i]:
        continue
    n = i * i 
    while n < N + 1:
        primes[n] = False
        n += i
print len([i for i in xrange(N + 1) if primes[i]])

发现超时，用time模块的clock测试,用了16s.之后一步一步优化，先将len([i for i in xrange(N + 1) if primes[i]])这句改成primes.count(True)时间缩短到14s,之后看了讨论组里的讨论,将

n = i * i 
    while n < N + 1:
        primes[n] = False
        n += i

这部分改成

1	primes[i * i:N + 1:i] = [False] * ((N - i * i) / i + 1)

时间没有明显的变化，因为看到[False] ((N - i i) / i + 1)，于是将[True for i in xrange(N + 1)] 这行改成[True] * (N + 1) ，速度明显加快，只用了4s,经过这样优化后，程序变成了

N = 10000000
primes = [True] * (N + 1) 
primes[0] = primes[1] = False
for i in xrange(2, N + 1): 
    if not primes[i]:
        continue
    primes[i * i:N + 1:i] = [False] * ((N - i * i) / i + 1)
print primes.count(True)

之后想办法将for循环去掉。于是将它改成

i = 2
while i * i <= N:
    if primes[i]:
        primes[i * i:N + 1:i] = [False] * ((N - i * i) / i + 1)
    i += 1

最终为

N = 10000000
primes = [True] * (N + 1) 
primes[0] = primes[1] = False
i = 2
while i * i <= N:
    if primes[i]:
        primes[i * i:N + 1:i] = [False] * ((N - i * i) / i + 1)
    i += 1
print primes.count(True)

时间跑到了2s以内，提交后通过了。

之后将之前求素数的程序筛法得到素数进行修改,得到

def getPrimes(n):
    primes = [True] * (n + 1)
    primes[0] = primes[1] = False
    i = 2
    while i * i <= n:
        if primes[i]:
            primes[i * i:n + 1:i] = [False] * ((n - i * i) / i + 1)
        i += 1
    return [i for i in xrange(n + 1) if primes[i]]

联系作者

最大子矩阵

By robinjia

發表於 2014-08-16

昨天去面试，面试官出了一道最大连续子序列的题目后，很快就做出来，因为前不久还做过笔记的。之后出了一道，最大子矩阵的题目。也就是给出一个矩阵，如：
0 -2 -7 0
9 2 -6 2
-4 1 -4 1
-1 8 0 -2
求它的子矩阵的最大和。
如
9 2
-4 1
-1 8
是最大子矩阵，和为15.

想了一会之后，找到了一个转化为最大连续子序列的办法。也就是先对列求和，之后再用最大连续子序列的方法。给出这个办法后，还想考虑优化，只是一直想不出来。回来之后，想起编程之美上有类似的题目，看了之后，没想到已经是最优的了。又想起以前在POJ应该做过类似的题目，于是找到了POJ1050- To The Max. 编写代码如下：

#include <stdio.h>
#include <stdlib.h>
#define N 100
int a[N][N];
int b[N];
int max_sub_array(int *a, int n) {
    int max_sum = a[0];
    int sum = a[0];
    for (int i = 1; i < n; i++) {
        if (sum < 0) {
            sum = 0;
        }
        sum += a[i];
        if (sum > max_sum) {
            max_sum = sum;
        }
    }
    return max_sum;
}
int main() {
    int n;
    scanf("%d", &n);
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            scanf("%d", &a[i][j]);
        }
    }
    int max_sum = -127 * 100 * 100;
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < n; k++) {
            b[k] = 0;
        }
        for (int j = i; j < n; j++) {
            for (int k = 0; k < n; k++) {
                b[k] += a[j][k];
            }
            int sum = max_sub_array(b, n);
            if (sum > max_sum) {
                max_sum = sum;
            }
        }
    }
    printf("%d\n", max_sum);
    return 0;
}

联系作者

HDU2222-Keywords Search

By robinjia

發表於 2014-08-16

AC自动机作为多模式匹配的经典方法，在关键词过滤中有重要应用，在我看来，AC自动机主要是这样几个步骤，第一步是先建立关键词的trie数，第二步是构建自动机，建立关键词之间的联系，第三步是利用第二步构建的自动机进行检索。HDU2222-关键词搜索是AC自动机的一个简单应用。关于AC自动机的资料，可以看这里http://www.notonlysuccess.com/index.php/aho-corasick-automaton/.
问题描述
现在，搜索引擎如谷歌，百度等已经走进每个人的生活。
Wiskey也想在他的图片检索系统中实现这个功能。
每一张图片有一段描述，当用户输入关键字查找图片时，系统会将这些关键字与图片的描述进行匹配，然后显示与这些关键字最匹配的图片。
为了简化用题，这里给你出一段图片的描述，和一些关键字，请你告诉我将匹配多少个关键字。
输入
第一行是一个整数，它的意思是有多少个测试用例。
每一个测试用例包含整数N，它的意思是有多少个关键字。(N <= 10000)
每一个关键字是由’a’-‘z’的字符组成，长度不超过50.
用例的最后一行是描述，长度不超过1000000.
示例输入
1
5
she
he
say
shr
her
yasherhs
示例输出
3

解法：
AC自动机的简单应用。有一个坑是，输入中关键词存在重复时，要计算多次。

代码如下

#include <iostream>
#include <stdio.h>
#include <string.h>
#include <queue>
using namespace std;
const int NUM = 26;
struct NODE {
    int cnt;
    NODE *fail;
    NODE *next[NUM];
    NODE () {
        cnt = 0;
        fail = NULL;
        memset(next, 0, sizeof(next));
    }
};

void insert(NODE *root, char *s) {
    NODE *cur = root;
    while (*s) {
        int index = *s - 'a';
        if (!cur->next[index]) {
            cur->next[index] = new NODE();
        }
        cur = cur->next[index];
        s++;
    }
    cur->cnt++;
} 
void ac_build(NODE *root) {
    queue <NODE *> q;
    NODE *cur;
    root->fail = NULL;
    for (int i = 0; i < NUM; i++) {
        if (root->next[i]) {
            root->next[i]->fail = root;
            q.push(root->next[i]);
        }
    } 
    while (!q.empty()) {
        cur = q.front();
        q.pop();
        for (int i = 0; i < NUM; i++) {
            if (cur->next[i]) {
                q.push(cur->next[i]);
                NODE *temp = cur->fail;
                while (temp) {
                    if (temp->next[i]) {
                        cur->next[i]->fail = temp->next[i];
                        break;
                    }
                    temp = temp->fail;
                }
                if (temp == NULL) {
                    cur->next[i]->fail = root;
                }
            }
        }
    }
}
int ac_find(NODE *root, char *s) {
    int sum = 0;
    NODE *cur = root;
    while (*s) {
        int index = *s - 'a';
        while (cur->next[index] == NULL && cur != root) {
            cur = cur->fail;
        }
        cur = (cur->next[index] == NULL) ? root : cur->next[index]; 
        NODE *temp = cur;
        while (temp != root && temp->cnt != -1) {  
            sum += temp->cnt;
            temp->cnt = -1;
            temp = temp->fail;
        }
        s++;
    }
    return sum;
}
int main() {
    int ncase, n;
    char word[51];
    char desc[1000001];
    scanf("%d", &ncase);
    while (ncase--) {
        NODE *root = new NODE();
        scanf("%d", &n);
        for (int i = 0; i < n; i++) {
            scanf("%s", word);
            insert(root, word);
        }
        scanf("%s", desc);
        ac_build(root);
        int res = ac_find(root, desc);
        printf("%d\n", res);
    }
    return 0;
}

联系作者

POJ2001-Shortest Prefixes

By robinjia

發表於 2014-08-14

Trie又称为前缀树或字典树，它有许多重要用途，如搜索提示，以及作为AC自动机的基础。POJ2001-最短前缀这题是Trie的基础应用。
Description
A prefix of a string is a substring starting at the beginning of the given string. The prefixes of “carbon” are: “c”, “ca”, “car”, “carb”, “carbo”, and “carbon”. Note that the empty string is not considered a prefix in this problem, but every non-empty string is considered to be a prefix of itself. In everyday language, we tend to abbreviate words by prefixes. For example, “carbohydrate” is commonly abbreviated by “carb”. In this problem, given a set of words, you will find for each word the shortest prefix that uniquely identifies the word it represents.

In the sample input below, “carbohydrate” can be abbreviated to “carboh”, but it cannot be abbreviated to “carbo” (or anything shorter) because there are other words in the list that begin with “carbo”.

An exact match will override a prefix match. For example, the prefix “car” matches the given word “car” exactly. Therefore, it is understood without ambiguity that “car” is an abbreviation for “car” , not for “carriage” or any of the other words in the list that begins with “car”.
Input
The input contains at least two, but no more than 1000 lines. Each line contains one word consisting of 1 to 20 lower case letters.
Output
The output contains the same number of lines as the input. Each line of the output contains the word from the corresponding line of the input, followed by one blank space, and the shortest prefix that uniquely (without ambiguity) identifies this word.
Sample Input
carbohydrate
cart
carburetor
caramel
caribou
carbonic
cartilage
carbon
carriage
carton
car
carbonate
Sample Output
carbohydrate carboh
cart cart
carburetor carbu
caramel cara
caribou cari
carbonic carboni
cartilage carti
carbon carbon
carriage carr
carton carto
car car
carbonate carbona
描述
一个字符串的前缀是从头开始的字符串的子串。”carbon”的前缀有：”c”,”ca”,”car”,”carb”,”carbo”和”carbon”.注意在这个问题中空串不认为是前缀，但是每个非空字符串自身可以认为是一个前缀。在日常用于中，我们倾向于用前缀来缩写词。例如，“carbohydrate”常缩写成”carb”.在这个问题中，我们给出一系列单词，要求能够唯一标识单词的最短前缀。

例如在下面的例子中，”carbohydrate”可以缩写成”carboh”,但是它不能缩写成”carbo”(或者更短),因为在这些词中，还有一个词是由”carbo”开始的.

精确匹配将覆盖前缀匹配。例如，前缀”car”精确匹配单词”car”.因此,”car”可以毫无歧义的认为是”car”的简写,而不是”carriage”的，或者其它一些由”car”作为前缀的单词.
输入：
输入至少包含两个，不超过1000行，每行宝行一个由1到20个字母组成的单词。
输出：
输出包含与输入相同的行数。每一行输出由对应的输入组成，跟着一个空格，之后是唯一(无歧义)标示单词的最短前缀.
示例输入：
carbohydrate
cart
carburetor
caramel
caribou
carbonic
cartilage
carbon
carriage
carton
car
carbonate
示例输出：
carbohydrate carboh
cart cart
carburetor carbu
caramel cara
caribou cari
carbonic carboni
cartilage carti
carbon carbon
carriage carr
carton carto
car car
carbonate carbona
解答：
在节点中增加一个time字段来记录路径被经过的次数，如果time=1,则只出现一次，说明可以用来标示单词，作为最短前缀。
代码如下：

#include
#include
#include
using namespace std;
const int NUM = 26;
const int MAX = 1000;
const int LENGTH = 21;
char words[MAX][LENGTH];
struct NODE {
    int time;
    NODE *next[NUM];
    NODE () {
        time = 0;
        memset(next, 0, sizeof(next));
    }
};

void insert(NODE *root, char *s) {
    NODE *cur = root;
    while (*s != '\0') {
        int index = *s - 'a';
        if (!cur->next[index]) {
            cur->next[index] = new NODE();
            cur = cur->next[index];
            cur->time = 1;
        } else {
            cur = cur->next[index];
            (cur->time)++;
        }
        s++;
    }
}
void search(NODE *root, char *s) {
    NODE *cur = root;
    while (*s != '\0') {
        int index = *s - 'a';
        cur = cur->next[index];
        if (cur->time != 1) {
            printf("%c", *s);
            s++;
        } else {
            printf("%c", *s);
            break;
        }
    }
    printf("\n");
}
int main() {
    int total = 0;
    NODE * root = new NODE();
    while (scanf("%s", words[total]) != EOF) {
        insert(root, words[total]);
        total++;
    }
    for (int i = 0; i < total; i++) {
        printf("%s ", words[i]);
        search(root, words[i]);
    }
    return 0;
}

联系作者

最小的K个数

By robinjia

發表於 2014-08-14

对于求最大的K个数和最小的K个数，一个解决的办法是使用堆，这里以最小的K个数为例。
题目描述：
输入n个整数，找出其中最小的K个数。例如输入4,5,1,6,2,7,3,8这8个数字，则最小的4个数字是1,2,3,4,。

输入：
每个测试案例包括2行：
第一行为2个整数n，k(1<=n，k<=200000)，表示数组的长度。
第二行包含n个整数，表示这n个数，数组中的数的范围是[0,1000 000 000]。

输出：
对应每个测试案例，输出最小的k个数，并按从小到大顺序打印。

样例输入：
8 4
4 5 1 6 2 7 3 8
样例输出：
1 2 3 4

对于这题，可以使用堆来解决。首先建立一个K个元素的大顶堆，对于之后的n-k个元素，每个与堆顶比较，如果大于堆顶，则它不可能是最小的K个数之一，如果小于堆顶，则将堆顶替换，并重建大顶堆。之后剩下的K个元素就是最小的K个数。对它们从小到大排序就可以得到结果。
写成代码如下：

#include <stdio.h>
#include <stdlib.h>
void swap(int &a, int &b) {
    int temp = a;
    a = b;
    b = temp;
}
void adjust_heap(int *a, int cur, int n) {
    int left, right;
    while (true) {
        left = 2 * cur + 1;
        right = 2 * cur + 2;
        int index = cur;
        if (left < n && a[left] > a[index]) {
            index = left;
        }
        if (right < n && a[right] > a[index]) {
            index = right;
        }
        if (index != cur) {
            swap(a[cur], a[index]);
            cur = index;
        } else {
            break;
        }
    }
}
void build_heap(int *a, int n) {
    for (int i = (n - 1) / 2; i >= 0; i--) {
        adjust_heap(a, i, n);
    }
}
void heap_sort(int *a, int n) {
    build_heap(a, n);
    for (int i = n - 1; i > 0; i--) {
        swap(a[0], a[i]);
        adjust_heap(a, 0, i);
    }
} 
int main() {
    int n, k, num;
    while(scanf("%d %d", &n, &k) != EOF) {
        int *a = new int[k];
        for (int i = 0; i < n; i++) {
            scanf("%d", &num);
            if (i <= k - 1) {
                a[i] = num;
                if (i == k - 1) {
                    build_heap(a, k);
                }
            } else {
                if (a[0] > num) {
                    swap(a[0], num);
                    adjust_heap(a, 0, k);
                }
            }
        }
        heap_sort(a, k);
        for (int i = 0; i < k - 1; i++) {
            printf("%d ", a[i]);
        }
        printf("%d\n", a[k - 1]);
        delete[] a;
    }
    return 0;
}

联系作者

堆排序

By robinjia

發表於 2014-08-13

对于求最小的K个数和最大的K个数，一种解决办法是使用堆。对于堆，数据结构的书籍中都有讲到，可是面试时不知是紧张还是什么原因，连堆都忘记了，悲哀，真是悲哀。

想来堆排序还是不难的，如果要对n个数字从小到大排序，则先建立这n个数字的大顶堆，之后堆顶与最后一个数字交换，此时就得到最大的数字，且在最后一位中，之后之需要对前n-1个数字排序。这里堆顶与最后一个数字交换后，会破坏了大顶堆，需要重建堆。对于大顶堆，意思就是堆顶的元素是最大的，之后是堆顶的左右子节点。

这里主要就是两个步骤，一个是建立大顶堆，一个是重建堆。
看代码可能会更容易一些

#include <stdio.h>
#include <stdlib.h>
void swap(int &a, int &b) {
    int temp = a;
    a = b;
    b = temp;
}
void adjust_heap(int *a, int cur, int n) {
    int left, right;
    while (true) {
        left = 2 * cur + 1;
        right = 2 * cur + 2;
        int index = cur;
        if (left < n && a[left] > a[index]) {
            index = left;
        }
        if (right < n && a[right] > a[index]) {
            index = right;
        }
        if (index != cur) {
            swap(a[cur], a[index]);
            cur = index;
        } else {
            break;
        }
    }
}
void build_heap(int *a, int n) {
    for (int i = (n - 1) / 2; i >= 0; i--) {
        adjust_heap(a, i, n);
    }
}
void heap_sort(int *a, int n) {
    build_heap(a, n);
    for (int i = n - 1; i > 0; i--) {
        swap(a[0], a[i]);
        adjust_heap(a, 0, i);
    }
} 
int main() {
    int a[] = {10, 2, 5, 7, 6, 13 , 8, 7};
    int n = sizeof(a) / sizeof(int);
    heap_sort(a, n);
    for (int i = 0; i < n - 1; i++) {
        printf("%d ", a[i]);
    }
    printf("%d\n", a[n - 1]);
    return 0;
}

联系作者

POJ2524-Ubiquitous Religions

By robinjia

發表於 2014-08-13

连并查集都忘记是怎么回事了，实在是不应该。还是复习一下为妙。
poj2524这一题是并差集的简单应用，先从这题开始。

There are so many different religions in the world today that it is difficult to keep track of them all. You are interested in finding out how many different religions students in your university believe in.

You know that there are n students in your university (0 < n <= 50000). It is infeasible for you to ask every student their religious beliefs. Furthermore, many students are not comfortable expressing their beliefs. One way to avoid these problems is to ask m (0 <= m <= n(n-1)/2) pairs of students and ask them whether they believe in the same religion (e.g. they may know if they both attend the same church). From this data, you may not know what each person believes in, but you can get an idea of the upper bound of how many different religions can be possibly represented on campus. You may assume that each student subscribes to at most one religion.

input:
The input consists of a number of cases. Each case starts with a line specifying the integers n and m. The next m lines each consists of two integers i and j, specifying that students i and j believe in the same religion. The students are numbered 1 to n. The end of input is specified by a line in which n = m = 0.

output:
For each test case, print on a single line the case number (starting with 1) followed by the maximum number of different religions that the students in the university believe in.

sample input
10 9
1 2
1 3
1 4
1 5
1 6
1 7
1 8
1 9
1 10
10 4
2 3
4 5
4 8
5 8
0 0
sample output
Case 1: 1
Case 2: 7
题目描述
世界上有许多不同的宗教，要记录全部是很困难的。你有兴趣找出在你所在的大学，有多少不同宗教信仰的学生。

你所在的大学有n个学生(0 < n <= 50000).问遍所有学生的宗教信仰是不实际的。并且，一些学生对于表达他们的信仰会觉得不舒服。一个避免这些问题的解决办法是问m(0 <= m <= n(n-1)/2)对学生，他们是否属于同一个宗教(也就是他们同时出现在相同的教堂).从这些数据里，你不能知道每一个人的信仰，但是可以知道校园里宗教数量的一个上界。你可以假设一个学生至多属于一个宗教。

输入：
输入中包含一些测试用例。每个例子由一行包含整数n和m开始。接下来的m行由两个整数i和j组成,i和j属于同一个宗教.学生从1到n编号.输入的结束由一行n = m = 0标示.

输出：
每一个测试用例，输出一个数字标示第几个测试用例(从1开始)跟着是这所大学的所有学生可能的最大宗教信仰数。

#include <stdio.h>
#include <iostream>
using namespace std;
const int MAX=50001;
int father[MAX];
int rank[MAX];
void make_set(int x) {
    father[x] = x;
    rank[x] = 1;
}
int find_set(int x) {
    if (x != father[x]) {
        father[x] = find_set(father[x]);
    }
    return father[x];
}
void union_set(int x, int y) {
    int fx = find_set(x);
    int fy = find_set(y);
    if (fx == fy)
        return;
    if (rank[fx] > rank[fy]) {
        father[fy] = fx;
        rank[fx] += rank[fy] ;
    } else {
        rank[fy] += rank[fx];
        father[fx] = fy;
    }
}
int main() {
    int n, m, a, b;
    int test_case = 0;
    while (true) {
        scanf("%d %d", &n, &m);
        if (n == 0 && m == 0) {
            break;
        }
        test_case++;
        for (int i = 1; i <= n; i++) {
            make_set(i);
        }
        while (m--) {
            scanf("%d %d", &a, &b);
            union_set(a, b);
        }
        int count = 0;
        for (int i = 1; i <= n; i++) {
            if (i == father[i]) {
                count++;
            }
        }
        printf("Case %d: %d\n", test_case, count);
    }
    return 0;
}

联系作者

重温manufactoria

By robinjia

發表於 2014-08-13

最近又玩了一遍manufactoria,发现有些关卡都快忘记怎么过的，这次还是写下来好了。
关卡按照从左到右，从上到下计数。
1. Move robots from the entrance(top) to the exit (bottom)! 将机器从入口(顶部)移到出口(底部)!
简单，不多说
2. If a robot’s string starts with blue, accept. Otherwise, reject! 如果机器以蓝色字符开头，则接受。否则，丢弃。
3. ACCEPT:if there are three or more blues! 如果有三个或三个以上的蓝色的，则接受
4. ACCEPT: if a robot contains NO red! 如果机器不包含红色的，则接受。
5. OUTPUT:The input,but with the first symbol at the end! 对于输入的字符，将第一个字符移到末尾作为输出
6. ACCEPT: if the tape has only alternating colors! 如果只存在交替字符，则接受。
意思就是字符如果是交替出现的则接受，如蓝红蓝红蓝，红蓝红蓝等等，而蓝蓝红，红蓝蓝等出现连续相同的字符，所以不接受
7. OUTPUT：Replace blue with green, and red with yellow! 输出：将蓝色换成绿色，将红色换成黄色。
8. ACCEPT: if the tape ends with two blues! 如果末尾是两个蓝色，则接受。
9. OUTPUT： Put a green at the begining, and a yellow at the end! 输出：对于输入的字符，在开头中添加一个绿色字符，在末尾中添加一个黄色字符。
10. ACCEPT: Strings that begin and end with the same color! 如果开始字符和结束字符时相同，则接受。
11. ACCEPT: With blue as 1 and red as 0, accept odd binary string! 把蓝色当做1，红色当做0，接受奇数二进制数。
其实就是接受蓝色结尾的字符。
12. ACCEPT: Some number of blue, then the same number of red! 接受：一些蓝色的，然后相同数量的红色的。
也就是接受诸如蓝蓝红红，蓝蓝蓝红红红等，当然空字符也要接受，因为空字符代表0个蓝色，0个红色。
解决办法是每次除去一个蓝色和红色
13.OUTPUT: Swap blue for red, and red for blue! 输出：将蓝色换成红色，红色换成蓝色。
也就是将字符串中的颜色互换。
14. OUTPUT: All of the blue, but none of the red! 输出字符串中的所有蓝色字符，不输出红色字符。
也就是将字符串中的所有红色字符去掉，留下蓝色的。
15. OUTPUT： The input, but with the last symbol moved to the front! 输出：对于输入的字符，将最后一个字符移动最前面。
在末尾添加一个绿色，用来标示最后一个字符
16. OUTPUT: With blue as 1 and red as 0, multiply by 8! 输出：把蓝色当做1，红色当做0，将二进制字符串乘以8.
其实也就是在字符串末尾添加3个0，也就是添加三个红色
17.ACCEPT: With blue as 1 and red as 0, accept binary strings > 15! 接受：把蓝色当做1，红色当做0，接受大于15的字符串。
18. An equal number of blue and red, in any order! 只要字符串中包含相同个数的蓝色和红色，则接受。
使用与12相同的办法
19.OUTPUT：Put a yellow in the middle of the (even-lenght) string! 输出：在字符串（偶数个字符串）的中间放置一个黄色字符。
在头尾都添加黄色，之后每个循环，头部的向前移一步，尾部的向后移一步。
20.ACCEPT: X blue, then X red, then X more blue, for any X! 接受： X个蓝色，然后X个红色，接着X个蓝色，对于X没有限制。
也就是接受蓝红蓝，蓝蓝红红蓝蓝等字符串，对于空字符也接受，因为它代表0个蓝色，然后0个红色，接着0个蓝色。
使用与12题相同的办法
21.OUTPUT: The input, but with all blues moved to the front! 输出：对于输入，将字符串中所有的蓝色移到前面。
逆向思维，将红色字符移到后面
22.OUTPUT: With blue as 1 and red as 0, add 1 to the binary string! 输出：将蓝色当做1，红色当做0，将二进制字符串加上1.
在末尾添加一个黄色，每个循环，如果黄色左边的是蓝色，则蓝色变成红色，并且黄色左后退一个字符，
如果黄色左边的红色，则将红色变成蓝色，程序结束。
23. ACCEPT: With blue as 1 and red as 0, accept natural powers of four! 接受：把蓝色当做1，红色当做0，接受四的开方
24.ACCEPT: (Even-length) strings that repeat midway through! 接受：(偶数长度)接受从中间开始重复的字符串
意思就是接受的字符串是偶数长度，前半段和后半段是一样的，如蓝红红蓝红红，
25. ACCEPT: If there are exactly twice as many blues as red! 接受：如果蓝色的个数是红色个数的两倍，则接受。
每个循环，除去两个蓝色和一个红色
26. OUTPUT: Reverse the input string! 输出：将输入的字符串逆转输出
27.OUTPUT: Subtract 1 from the binary string!(Input >= 1) 输出：从二进制字符串中减去1(输入字符串>=1)
在末尾添加一个黄色，每个循环，如果黄色左边的是红色，则红色变成蓝色，并且黄色左后退一个字符，
如果黄色左边的蓝色，则将蓝色变成红色，程序结束。
28.ACCEPT: Perfectly symmetrical strings! 接受：完美对称字符串！
意思就是回文串，也就是从左读到右和从右读到左是一样的。
每个循环，头尾消去的字符是一样的。
29.ACCEPT: Two identical strings, separated by a green! 接受：两个相同的字符串，由绿色隔开。
30. ACCEPT: Read the tape as two numbers, A and B, split by a green: accept if A > B! 接受：读入的字符串当做两个二进制数，A和B，由一个绿色隔开，如果A>B则接受。
每次比较A和B的字符，分四种情况
蓝蓝，继续比较A和B剩余的字符
红红，继续比较A和B剩余的字符
蓝红，这种情况下，有个小技巧，将B的字符再消去一个，这时A剩余的字符比B剩余的字符长，则A>B
红蓝,A剩余的字符比B剩余的字符长，则A>B
31. OUTPUT: Read the tape as two numbers, A and B, split by a green: output A + B! 输出：读入的字符串当做两个二进制数，A和B，由一个绿色隔开，输出A+B的和！
将A和B都转成黄色字符，之后再将黄色字符转成二进制。

阿龙的学习笔记

兼济天下则达,独善其身则穷

Lucene入门例子

联系作者

数字漩涡

联系作者

Python筛法求素数的优化

联系作者

最大子矩阵

联系作者

HDU2222-Keywords Search

联系作者

POJ2001-Shortest Prefixes

联系作者

最小的K个数

联系作者

堆排序

联系作者

POJ2524-Ubiquitous Religions

联系作者

重温manufactoria

联系作者