POJ3865 UVA1592 UVALive4592 Database [map + sorting + hash function]

Database

Time Limit: 5000MS   Memory Limit: 65536K
Total Submissions: 1627   Accepted: 288   Special Judge

Description

Peter studies the theory of relational databases. A table in a relational database consists of values that are arranged in rows and columns.
There are different normal forms that a database may adhere to. Normal forms are designed to minimize the redundancy of data in the database. For example, a database table for a library might have a row for each book and columns for book name, book author, and author's email.
If the same author wrote several books, then this representation is clearly redundant. To formally define this kind of redundancy Peter has introduced his own normal form. A table is in Peter's Normal Form (PNF) if and only if there is no pair of rows and a pair of columns such that the values in the corresponding columns are the same for both rows. 

How to compete in ACM ICPC      Peter     [email protected]
How to win ACM ICPC             Michael   [email protected]
Notes from ACM ICPC champion    Michael   [email protected]


The above table is clearly not in PNF, since the values in the 2nd and 3rd columns repeat in the 2nd and 3rd rows.
However, if we introduce a unique author identifier and split this table into two tables - one containing book name and author id, and the other containing author id, author name, and author email, then both resulting tables will be in PNF.

How to compete in ACM ICPC      1
How to win ACM ICPC             2
Notes from ACM ICPC champion    2


 

1   Peter     [email protected]
2   Michael   [email protected]


Given a table your task is to figure out whether it is in PNF or not. 

Input

The first line of the input file contains two integer numbers n and m (1 <= n <= 10 000, 1 <= m <= 10), the number of rows and columns in the table. The following n lines contain table rows. Each row has m column values separated by commas. Column values consist of ASCII characters from space (ASCII code 32) to tilde (ASCII code 126) with the exception of comma (ASCII code 44). Values are not empty and have no leading and trailing spaces. Each row has at most 80 characters (including separating commas).

Output

If the table is in PNF write to the output file a single word "YES" (without quotes). If the table is not in PNF, then write three lines. On the first line write a single word "NO" (without quotes). On the second line write two integer row numbers r1 and r2 (1 <= r1, r2 <= n, r1 != r2), on the third line write two integer column numbers c1 and c2 (1 <= c1, c2 <= m, c1 != c2), so that values in columns c1 and c2 are the same in rows r1 and r2.

Sample Input

Sample Input #1:
3 3
How to compete in ACM ICPC,Peter,[email protected]
How to win ACM ICPC,Michael,[email protected]
Notes from ACM ICPC champion,Michael,[email protected]

Sample Input #2:
2 3
1,Peter,[email protected]
2,Michael,[email protected]

Sample Output

Sample Output #1:
NO
2 3
2 3

Sample Output #2:
YES

Source

Northeastern Europe 2009

Problem links: POJ3865 UVA1592 UVALive4592 Database

Problem description: (omitted)

Problem analysis

  This problem is framed in terms of databases; it is about relationships between the values stored in a table.

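  The task is to determine whether there exist two different rows r1, r2 and two different columns c1, c2 such that (r1,c1)=(r2,c1) and (r1,c2)=(r2,c2), that is, two distinct rows whose values (strings) agree in both columns.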
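The condition can be checked literally by trying every pair of rows and every pair of columns. The sketch below is only a reference implementation: the names `Violation` and `findViolation` are made up for illustration and do not appear in the programs further down. With O(n^2 m^2) string comparisons it is far too slow for n = 10 000, but it can be useful for validating a faster solution on small inputs.

```cpp
#include <string>
#include <vector>

// Hypothetical result type: 1-based rows and columns of a PNF violation.
struct Violation {
    bool found;
    int r1, r2, c1, c2;
};

// Brute-force reference check: try every pair of rows and every pair of columns.
Violation findViolation(const std::vector<std::vector<std::string> >& table)
{
    Violation none = {false, 0, 0, 0, 0};
    int n = (int)table.size();
    int m = n ? (int)table[0].size() : 0;
    for (int r1 = 0; r1 < n; r1++)
        for (int r2 = r1 + 1; r2 < n; r2++)
            for (int c1 = 0; c1 < m; c1++)
                for (int c2 = c1 + 1; c2 < m; c2++)
                    if (table[r1][c1] == table[r2][c1] &&
                        table[r1][c2] == table[r2][c2]) {
                        Violation hit = {true, r1 + 1, r2 + 1, c1 + 1, c2 + 1};
                        return hit;
                    }
    return none;
}
```

Because the judge is a special judge, any valid pair of rows and columns should be acceptable, so a fast solution need not report the same pair as this reference.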

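  First, an STL map can be used to map each string to an integer, and the rest of the processing then works on integers. Alternatively, a hash function can be used to build a hash table for the same purpose.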

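  After that, an exhaustive comparison is needed. A direct brute-force search is very likely to get TLE, however, so some tricks are needed to cut the running time as much as possible. The details are all in the code.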
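One possible trick, given here as a sketch and not as the exact method of the programs below: once every cell is an integer id, the two ids of a row for a fixed column pair can be packed into a single 64-bit key, so that finding a repeated pair reduces to finding two equal neighbours in a sorted array. The function name `findDuplicatePair` is hypothetical, and the code assumes the ids are non-negative and fit in 32 bits.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

typedef unsigned long long ULL;

// For one fixed column pair, colA[i] and colB[i] are the integer ids of row i.
// Returns the 0-based indices of two rows sharing both ids, or (-1, -1).
// Assumption: ids are non-negative and fit in 32 bits.
std::pair<int, int> findDuplicatePair(const std::vector<int>& colA,
                                      const std::vector<int>& colB)
{
    std::vector<std::pair<ULL, int> > keys;   // (packed key, row index)
    for (int i = 0; i < (int)colA.size(); i++) {
        ULL key = ((ULL)(unsigned)colA[i] << 32) | (unsigned)colB[i];
        keys.push_back(std::make_pair(key, i));
    }
    std::sort(keys.begin(), keys.end());      // equal keys become adjacent
    for (int i = 1; i < (int)keys.size(); i++)
        if (keys[i].first == keys[i - 1].first)
            return std::make_pair(keys[i - 1].second, keys[i].second);
    return std::make_pair(-1, -1);
}
```

Calling this once per column pair gives at most 45 sorts of n keys each for m = 10, which is the same overall shape as the sorting-based program below.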

Program notes

  The function BKDRHash() computes the hash value of a string.
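BKDRHash treats the string as a number written in base 131, with unsigned overflow reducing the result modulo 2^64. The sketch below restates it with a `const char *` parameter; the lowercase name `bkdrHash` is only used here to keep it apart from the function in the programs.

```cpp
typedef unsigned long long ULL;

// Polynomial (BKDR-style) hash with seed 131; overflow wraps modulo 2^64.
ULL bkdrHash(const char *s)
{
    ULL hash = 0;
    for (int i = 0; s[i]; i++)
        hash = hash * 131 + (unsigned char)s[i];
    return hash;
}
```

For the string "ab", with 'a' = 97 and 'b' = 98, this gives (0 * 131 + 97) * 131 + 98 = 12805.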

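  The fact that the last two programs get TLE on POJ shows that writing programs with the STL gives concise, high-quality (less error-prone) code, but at some cost in time and space, especially time.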

  In addition, the solutions to this problem show that a hash function can replace the map over strings.
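Replacing the map with a hash function means that two different strings could in principle share a hash value, and the hash-based programs below do not check for that. A hedged sketch of how a candidate answer could be confirmed against the original strings follows; it assumes the raw cells are kept in a `table` of strings, which those programs do not do, and the name `confirmMatch` is hypothetical.

```cpp
#include <string>
#include <vector>

// Confirms a candidate answer (0-based rows r1, r2 and columns c1, c2)
// by comparing the original strings instead of trusting equal hash values.
bool confirmMatch(const std::vector<std::vector<std::string> >& table,
                  int r1, int r2, int c1, int c2)
{
    return r1 != r2 && c1 != c2 &&
           table[r1][c1] == table[r2][c1] &&
           table[r1][c2] == table[r2][c2];
}
```

A solution could call this before printing "NO" and continue searching when the check fails, at the price of storing the strings as well as their hashes.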

Reference links: (omitted)

Afterword: running time is an eternal topic.

The accepted (AC) C++ program is as follows:

/* POJ3865 UVA1592 UVALive4592 Database */

#include <iostream>
#include <algorithm>
#include <stdio.h>
#include <string.h>

using namespace std;

typedef unsigned long long ULL;

const int ROW = 10000;
const int COL = 10;
ULL db[ROW][COL];
char s[2048];

ULL BKDRHash(char s[])
{
    ULL seed = 131;
    ULL hash = 0;
    for(int i = 0; s[i]; i++)
        hash = hash * seed + s[i];
    return hash;
}

struct Node {
    ULL c1, c2;
    int idx;
} a[ROW];

// Order by the first hash, then by the second, so equal pairs end up adjacent.
bool cmp(const Node& x, const Node& y)
{
    return x.c1 == y.c1 ? x.c2 < y.c2 : x.c1 < y.c1;
}

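void solve(int n, int m)
{
    // Try every pair of columns (at most 45 pairs for m = 10).
    for(int c1 = 0; c1 < m; c1++)
        for(int c2 = c1 + 1; c2 < m; c2++) {
            // Collect the two hash values of each row together with its row index.
            for(int row = 0; row < n; row++) {
                a[row].c1 = db[row][c1];
                a[row].c2 = db[row][c2];
                a[row].idx = row;
            }
            // After sorting, rows with the same (c1, c2) pair are adjacent.
            sort(a, a + n, cmp);
            for(int row = 1; row < n; row++)
                if(a[row].c1 == a[row - 1].c1 && a[row].c2 == a[row - 1].c2) {
                    // Equal hashes are treated as equal strings (collisions are not checked).
                    printf("NO\n%d %d\n%d %d\n", a[row - 1].idx + 1, a[row].idx + 1, c1 + 1, c2 + 1);
                    return;
                }
        }
    printf("YES\n");
}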

int main()
{
    int n, m;
    while(~scanf("%d%d", &n, &m)) {
        getchar();
        for(int i = 0; i < n; i++) {
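            if(fgets(s, sizeof(s), stdin) == NULL)    // gets() is unsafe and was removed in C++14
                break;
            s[strcspn(s, "\r\n")] = '\0';             // drop the trailing line break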
            int j = 0;
            char *p = strtok(s, ",");
            while(p) {
                db[i][j++] = BKDRHash(p);
                p = strtok(NULL, ",");
            }
        }

        solve(n, m);
    }

    return 0;
}

The following C++ program is AC (but gets TLE on POJ3865):

/* POJ3865 UVA1592 UVALive4592 Database */

#include <iostream>
#include <map>
#include <stdio.h>
#include <string.h>

using namespace std;

typedef unsigned long long ULL;

const int ROW = 10000;
const int COL = 10;
ULL db[ROW][COL];
char s[2048];

ULL BKDRHash(char s[])
{
    ULL seed = 131;
    ULL hash = 0;
    for(int i = 0; s[i]; i++)
        hash = hash * seed + s[i];
    return hash;
}

void solve(int n, int m)
{
    for(int c1 = 0; c1 < m; c1++)
        for(int c2 = c1 + 1; c2 < m; c2++) {
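            map<pair<ULL, ULL>, int> seen;   // (hash in c1, hash in c2) -> first row with that pair
            for(int row = 0; row < n; row++) {
                // Keep the full 64-bit hashes; pair<int,int> would truncate them.
                pair<ULL, ULL> p = make_pair(db[row][c1], db[row][c2]);
                if(seen.count(p)) {
                    printf("NO\n");
                    printf("%d %d\n", seen[p] + 1, row + 1);
                    printf("%d %d\n", c1 + 1, c2 + 1);
                    return;
                } else
                    seen[p] = row;
            }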
            }
        }
    printf("YES\n");
}

int main()
{
    int n, m;
    while(~scanf("%d%d", &n, &m)) {
        getchar();
        for(int i = 0; i < n; i++) {
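            if(fgets(s, sizeof(s), stdin) == NULL)    // gets() is unsafe and was removed in C++14
                break;
            s[strcspn(s, "\r\n")] = '\0';             // drop the trailing line break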
            int j = 0;
            char *p = strtok(s, ",");
            while(p) {
                db[i][j++] = BKDRHash(p);
                p = strtok(NULL, ",");
            }
        }

        solve(n, m);
    }

    return 0;
}

The following C++ program is AC (but gets TLE on POJ3865):

/* POJ3865 UVA1592 UVALive4592 Database */

#include <iostream>
#include <map>
#include <string>
#include <stdio.h>

using namespace std;

const int ROW = 10000;
const int COL = 10;
int db[ROW][COL];
int idno = 0;

map<string, int> id;
int getid(const string& s)
{
    if(!id.count(s))
        id[s] = ++idno;
    return id[s];
}

void solve(int n, int m)
{
    for(int c1 = 0; c1 < m; c1++)
        for(int c2 = c1 + 1; c2 < m; c2++) {
            map<pair<int, int>, int> seen;   // (id in c1, id in c2) -> first row with that pair
            for(int row = 0; row < n; row++) {
                pair<int,int> p = make_pair(db[row][c1], db[row][c2]);
                if(seen.count(p)) {
                    printf("NO\n");
                    printf("%d %d\n", seen[p] + 1, row + 1);
                    printf("%d %d\n", c1 + 1, c2 + 1);
                    return;
                } else
                    seen[p] = row;
            }
            }
        }
    printf("YES\n");
}

int main()
{
    int n, m;
    while(~scanf("%d%d", &n, &m)) {
        string s;

        getchar();
        for(int i = 0; i < n; i++) {
            getline(cin, s);
            int start = 0;
            for(int j = 0; j < m; j++) {
                int pos = s.find(',', start);
                if(pos == (int)string::npos)
                    pos = s.length();
                db[i][j] = getid(s.substr(start, pos - start));
                start = pos + 1;
            }
        }

        solve(n, m);
    }

    return 0;
}


Reposted from blog.csdn.net/tigerisland45/article/details/82312555