First, there is a file named article.txt with the following contents:
Today was a last vacation day. I should goto the class.
I don‘t like this school’s classes. I don‘t like the chalk and monitor .
I must write and read, learn and read.
Was it very bad? In class ,We must understand andlearn the knowledge.
I don‘t like quiz and exercise, I like experiment.
I think it was a bad day!
It was Sunday.
My brother and I were at the zoo.
We saw birds ,horses, bears and monkeys.
The monkeys were very funny. We looked
at them and they looked at us.
Oh! my god!They looked at a big map.
Do they know what were on the map? No? Yes?
I don‘t know, but, they could study, really!
We know, because we study .They study ,they can know?
-----------------------------------------------------------------------------------------------------------
The program implements the following features:
1. Query a single word
Executing Query for: was
was occurs 3 times
(line 1)Today was a last vacation day. I should goto the class.
(line 6)I think it was a bad day!
(line 7)It was Sunday.
2. Negate a single-word query
Executing Query for: ~(was)
~(was) occurs 12 times
(line 2)I don‘t like this school’s classes. I don‘t like the chalk and monitor .
(line 3)I must write and read, learn and read.
(line 4)Was it very bad? In class ,We must understand andlearn the knowledge.
(line 5)I don‘t like quiz and exercise, I like experiment.
(line 8)My brother and I were at the zoo.
(line 9)We saw birds ,horses, bears and monkeys.
(line 10)The monkeys were very funny. We looked
(line 11)at them and they looked at us.
(line 12)Oh! my god!They looked at a big map.
(line 13)Do they know what were on the map? No? Yes?
(line 14)I don‘t know, but, they could study, really!
(line 15)We know, because we study .They study ,they can know?
3. Union of two words
Executing Query for: (were | was)
(were | was) occurs 6 times
(line 1)Today was a last vacation day. I should goto the class.
(line 6)I think it was a bad day!
(line 7)It was Sunday.
(line 8)My brother and I were at the zoo.
(line 10)The monkeys were very funny. We looked
(line 13)Do they know what were on the map? No? Yes?
4. Intersection of two words
Executing Query for: (was & the)
(was & the) occurs 1 time
(line 1)Today was a last vacation day. I should goto the class.
5. Compound query
Executing Query for: ((was & the) | were)
((was & the) | were) occurs 4 times
(line 1)Today was a last vacation day. I should goto the class.
(line 8)My brother and I were at the zoo.
(line 10)The monkeys were very funny. We looked
(line 13)Do they know what were on the map? No? Yes?
-----------------------------------------------------------------------------------------------------------
Implementation:
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <map>
#include <memory>
#include <set>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

// Result of a query: the query text, the matching line indices and the file.
class QueryResult
{
public:
    friend ostream &print(ostream &, const QueryResult &);
    QueryResult(string s, shared_ptr<set<int>> w, shared_ptr<vector<string>> f)
        : word(s), wordlineSet(w), file(f) {}
    std::set<int>::iterator begin() { return wordlineSet->begin(); }
    std::set<int>::iterator end() { return wordlineSet->end(); }
    std::shared_ptr<std::vector<std::string>> get_file() { return file; }
private:
    string word;
    shared_ptr<set<int>> wordlineSet;
    shared_ptr<vector<string>> file;
};

// Reads the file and maps every word to the set of lines that contain it.
class TextQuery
{
public:
    TextQuery(ifstream &is) : file(new vector<string>)
    {
        string text;
        while (getline(is, text))
        {
            file->push_back(text);
            // Index stored in the vector is the actual line number minus 1.
            int lineNumber = static_cast<int>(file->size()) - 1;
            istringstream line(text);
            string word;
            while (line >> word)
            {
                auto &lines = wordmap[word];
                if (!lines)
                    lines.reset(new set<int>);
                // About . and ->:
                // .  operates on lines itself (the smart pointer);
                // -> operates on the object that lines points to.
                lines->insert(lineNumber);
            }
        }
    }
    QueryResult query(const string &s) const
    {
        // Shared empty set returned when the word is not found.
        static shared_ptr<set<int>> nodata(new set<int>);
        auto wordloc = wordmap.find(s);
        if (wordloc == wordmap.end())
            return QueryResult(s, nodata, file);
        else
            return QueryResult(s, wordloc->second, file);
    }
private:
    shared_ptr<vector<string>> file;
    map<string, shared_ptr<set<int>>> wordmap;
};

// Abstract base class of all query types.
class Query_base
{
    friend class Query;
protected:
    virtual ~Query_base() = default;
private:
    virtual QueryResult eval(const TextQuery &) const = 0;
    virtual string rep() const = 0;
};

// Interface class: holds a pointer to a Query_base and forwards to it.
class Query
{
    friend Query operator~(const Query &);
    friend Query operator|(const Query &, const Query &);
    friend Query operator&(const Query &, const Query &);
public:
    Query(const string &);
    QueryResult eval(const TextQuery &t) const
    {
        return q->eval(t);
    }
    string rep() const
    {
        return q->rep();
    }
private:
    Query(shared_ptr<Query_base> query) : q(query) {}
    shared_ptr<Query_base> q;
};

// Overloaded output operator (<<)
std::ostream &operator<<(std::ostream &os, const Query &query)
{
    return os << query.rep();
}

// Single-word query
class WordQuery : public Query_base
{
    friend class Query;
    WordQuery(const string &s) : query_word(s) {}
    QueryResult eval(const TextQuery &t) const { return t.query(query_word); }
    string rep() const { return query_word; }
    string query_word;
};

// The Query interface binds dynamically to a WordQuery.
inline Query::Query(const std::string &s) : q(new WordQuery(s)) {}

// NOT query
class NotQuery : public Query_base
{
    friend Query operator~(const Query &); // the friend is the negation operator
    NotQuery(const Query &q) : query(q) {}
    string rep() const { return "~(" + query.rep() + ")"; }
    QueryResult eval(const TextQuery &t) const;
    Query query;
};

// Negation: returns a Query bound dynamically to a NotQuery object.
// The operand is a Query (ultimately built from a WordQuery) that is
// passed into the NotQuery.
inline Query operator~(const Query &operand)
{
    return shared_ptr<Query_base>(new NotQuery(operand));
}

// Binary query: it defines no eval, so it inherits the pure virtual
// function and remains abstract.
class BinaryQuery : public Query_base
{
protected:
    BinaryQuery(const Query &l, const Query &r, std::string s)
        : lhs(l), rhs(r), opSym(s) {}
    std::string rep() const
    {
        return "(" + lhs.rep() + " " + opSym + " " + rhs.rep() + ")";
    }
    Query lhs, rhs;
    std::string opSym;
};

// AND query (intersection)
class AndQuery : public BinaryQuery
{
    friend Query operator&(const Query &, const Query &);
    AndQuery(const Query &left, const Query &right) : BinaryQuery(left, right, "&") {}
    QueryResult eval(const TextQuery &) const;
};

inline Query operator&(const Query &lhs, const Query &rhs)
{
    return shared_ptr<Query_base>(new AndQuery(lhs, rhs));
}

// OR query (union)
class OrQuery : public BinaryQuery
{
    friend Query operator|(const Query &, const Query &);
    OrQuery(const Query &left, const Query &right) : BinaryQuery(left, right, "|") {}
    QueryResult eval(const TextQuery &) const;
};

inline Query operator|(const Query &lhs, const Query &rhs)
{
    return shared_ptr<Query_base>(new OrQuery(lhs, rhs));
}

QueryResult OrQuery::eval(const TextQuery &text) const
{
    auto right = rhs.eval(text), left = lhs.eval(text);
    auto ret_lines = std::make_shared<std::set<int>>(left.begin(), left.end());
    ret_lines->insert(right.begin(), right.end());
    return QueryResult(rep(), ret_lines, left.get_file());
}

QueryResult AndQuery::eval(const TextQuery &text) const
{
    // Calls each operand's eval (WordQuery::eval for single words).
    auto left = lhs.eval(text), right = rhs.eval(text);
    auto ret_lines = std::make_shared<std::set<int>>();
    set_intersection(left.begin(), left.end(), right.begin(), right.end(),
                     inserter(*ret_lines, ret_lines->begin()));
    return QueryResult(rep(), ret_lines, left.get_file());
}

QueryResult NotQuery::eval(const TextQuery &text) const
{
    auto result = query.eval(text); // calls the operand's eval (WordQuery::eval here)
    auto ret_lines = std::make_shared<std::set<int>>();
    auto beg = result.begin(), end = result.end();
    auto sz = result.get_file()->size();
    for (size_t n = 0; n != sz; ++n)
    {
        if (beg == end || *beg != n)
            ret_lines->insert(n); // line n is not in the operand's result
        else if (beg != end)
            ++beg;                // line n is in the operand's result: skip it
    }
    return QueryResult(rep(), ret_lines, result.get_file());
}

bool get_word(std::string &str)
{
    std::cout << "enter word to look for, or q to quit: " << std::endl;
    if (!(std::cin >> str) || str == "q")
    {
        std::cout << str;
        return false;
    }
    else
    {
        std::cout << str;
        return true;
    }
}

// Returns s1 for a count of 0 or 1, otherwise s2.
string make_plural(const int length, const string &s1, const string &s2)
{
    return (length == 0 || length == 1) ? s1 : s2;
}

ostream &print(ostream &os, const QueryResult &qr)
{
    os << qr.word << " occurs " << qr.wordlineSet->size() << " "
       << make_plural(qr.wordlineSet->size(), "time", "times") << endl;
    for (auto num : *qr.wordlineSet)
        os << "  (line " << num + 1 << ")" << *(qr.file->begin() + num) << endl;
    return os;
}

void runQueries(ifstream &infile)
{
    TextQuery t1(infile);
    while (true)
    {
        cout << "Enter word to look for,or q to quit" << endl;
        string s;
        if (!(cin >> s) || s == "q")
            break;
        print(cout, t1.query(s)) << endl;
    }
}

int main()
{
    ifstream in("article.txt");
    //runQueries(in);
    TextQuery file = in;
    //Query q = Query("was");
    //Query q = ~Query("was");
    //Query q = Query("were") | Query("was");
    //Query q = Query("was") & Query("the");
    Query q = Query("was") & Query("the") | Query("were");
    const auto results = q.eval(file);
    cout << "Executing Query for: " << q << endl;
    print(cout, results) << endl;
}
-----------------------------------------------------------------------------------------------------------
Implementation details
1. Single-word query
int main()
{
    ifstream in("article.txt");
    TextQuery file = in;
    Query q = Query("was");
    const auto results = q.eval(file);
    cout << "Executing Query for: " << q << endl;
    print(cout, results) << endl;
}
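First, an ifstream object named in is created.
in opens the target file article.txt as an input file stream.
A TextQuery object named file is created. TextQuery file = in; is copy-initialization: it invokes the constructor TextQuery(ifstream &is), not the copy-assignment operator.
A Query object named q is created, again by initialization rather than assignment. The constructor used is Query(const string&); .
A const variable results is defined with type auto, which is deduced from the return type of q.eval(file), namely QueryResult.
A string is printed, followed by the Query object q. The << in front of q has been overloaded: std::ostream & operator<<(std::ostream &os, const Query &query) { return os << query.rep(); }
so what is printed is q.rep().
The last line calls print to output the results object; see the print function for details.
That is the overall flow. The details follow.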
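Inside main(), the following statement defines a Query named q and initializes it from a string. Because this is an initialization, a constructor runs; the copy-assignment operator is not involved.
    Query q = Query("was");
The constructor called is
    Query::Query(const std::string &s) : q(new WordQuery(s)) {}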
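To make the difference between initialization and assignment concrete, here is a minimal sketch. Tracker, init_with_equals and assign_later are hypothetical names invented for this illustration and are not part of the program; the type simply records which special member function ran last.

```cpp
#include <string>

// Hypothetical type used only to show which special member function runs.
struct Tracker {
    std::string last;  // name of the most recently invoked operation

    Tracker(const std::string &) : last("converting constructor") {}
    Tracker(const Tracker &) : last("copy constructor") {}
    Tracker &operator=(const Tracker &) {
        last = "copy assignment";
        return *this;
    }
};

// Initialization written with '=': a constructor runs, operator= does not.
inline std::string init_with_equals() {
    Tracker t = Tracker("was");
    return t.last;
}

// Assignment to an object that already exists: operator= runs.
inline std::string assign_later() {
    Tracker a("was");
    Tracker b("were");
    b = a;
    return b.last;
}
```

Whether init_with_equals reports the converting constructor or the copy constructor depends on the language standard and on copy elision, but it never reports copy assignment.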
Next, consider another member function of Query, eval():
    QueryResult Query::eval(const TextQuery &t) const
    {
        return q->eval(t);
    }
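For the eval member function of the Query object q:
Return type: QueryResult
Parameter list: const TextQuery &t (a reference to a const TextQuery object t)
Qualifier: const (that is, a const member function)
Function body: { return q->eval(t); } calls eval(t) on the object that the data member q points to, passing along the parameter t. This is a single level of forwarding, not recursion. The Query class has a data member shared_ptr<Query_base> q; that is, a smart pointer q to a Query_base.
The object that this smart pointer refers to is the one supplied by the constructor shown above, q(new WordQuery(s)).
Because q points to a WordQuery, the virtual call is dispatched to WordQuery's eval function:
    QueryResult WordQuery::eval(const TextQuery &t) const { return t.query(query_word); }
The argument t received by WordQuery::eval is the same object that was passed to Query::eval, because it is passed by reference.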
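The forwarding described here can be reduced to a small sketch. Base, Word, Negation and Handle are hypothetical, simplified stand-ins for Query_base, WordQuery, NotQuery and Query; they return strings instead of QueryResult objects so that the call chain is visible.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for Query_base.
struct Base {
    virtual ~Base() = default;
    virtual std::string eval() const = 0;
};

// Hypothetical stand-in for WordQuery.
struct Word : Base {
    explicit Word(std::string w) : word(std::move(w)) {}
    std::string eval() const override { return "Word::eval(" + word + ")"; }
    std::string word;
};

// Hypothetical stand-in for NotQuery: wraps another query.
struct Negation : Base {
    explicit Negation(std::shared_ptr<Base> p) : inner(std::move(p)) {}
    std::string eval() const override {
        return "Negation::eval(" + inner->eval() + ")";
    }
    std::shared_ptr<Base> inner;
};

// Hypothetical stand-in for Query: stores a base-class pointer and forwards
// eval() through it; the derived class that runs is chosen at run time.
class Handle {
public:
    explicit Handle(std::shared_ptr<Base> p) : q(std::move(p)) {}
    std::string eval() const { return q->eval(); }
private:
    std::shared_ptr<Base> q;
};
```

The handle never needs to know which derived type it holds, which is why the same Query::eval body works for every kind of query in the program.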
The query function of TextQuery, called on t, is as follows:
    QueryResult TextQuery::query(const string &s) const
    {
        static shared_ptr<set<int>> nodata(new set<int>);
        auto wordloc = wordmap.find(s);
        if (wordloc == wordmap.end())
            return QueryResult(s, nodata, file);
        else
            return QueryResult(s, wordloc->second, file);
    }
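It returns a QueryResult instance. (The wordmap it searches was built by the TextQuery(ifstream &is) constructor, not by a default constructor.)
So results in main() is a QueryResult instance.
QueryResult has three data members:
    string word;
    shared_ptr<set<int>> wordlineSet;
    shared_ptr<vector<string>> file;
Finally, print reads these data members of the QueryResult instance and prints the result, as shown in the program above.
This completes the walkthrough of the first query.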
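The lookup above depends on the word-to-lines map. As a minimal sketch of how such a map can be built, the hypothetical helper build_index below (not part of the program) follows the same splitting rule as the TextQuery constructor but stores plain sets instead of shared_ptr.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: maps each whitespace-separated word to the
// zero-based indices of the lines in which it appears.
inline std::map<std::string, std::set<std::size_t>>
build_index(const std::vector<std::string> &lines) {
    std::map<std::string, std::set<std::size_t>> index;
    for (std::size_t n = 0; n != lines.size(); ++n) {
        std::istringstream line(lines[n]);
        std::string word;
        while (line >> word)
            index[word].insert(n);  // operator[] creates the set on first use
    }
    return index;
}
```

Because words are split on whitespace only, punctuation stays attached and case is significant: "Sunday." and "Sunday" are different keys, and "Was" does not match "was". This is consistent with line 4 of the article not appearing in the results for was.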
-----------------------------------------------------------------------------------------------------------
2. Negation of a single-word query
int main()
{
    ifstream in("article.txt");
    TextQuery file = in;
    Query q = ~Query("was");
    const auto results = q.eval(file);
    cout << "Executing Query for: " << q << endl;
    print(cout, results) << endl;
}
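First, an ifstream object named in is created.
in opens the target file article.txt as an input file stream.
A TextQuery object named file is created by copy-initialization, which invokes the constructor TextQuery(ifstream &is) : file(new vector<string>). This constructor fills wordmap, whose type is map<string, shared_ptr<set<int>>>.
A Query object named q is created. Query("was") is constructed first (it holds a smart pointer to the base class), and the overloaded ~ operator then produces a Query rvalue (C++ distinguishes lvalues from rvalues; look it up if the distinction is unfamiliar). The ~ operator is defined as
    inline Query operator~ (const Query &operand)
    {
        return shared_ptr<Query_base>(new NotQuery(operand));
    }
That is, it returns a shared_ptr<Query_base> that points to a NotQuery object, and this pointer is implicitly converted to a Query through the private constructor Query(shared_ptr<Query_base>).
A const variable results is defined with type auto, which is deduced from the return type of q.eval(file), namely QueryResult.
    QueryResult Query::eval(const TextQuery &t) const
    {
        return q->eval(t);
    }
This calls eval() through the smart pointer member q of the Query class. As explained above, q points to a NotQuery here, so NotQuery's eval() function is called: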
    QueryResult NotQuery::eval(const TextQuery& text) const
    {
        auto result = query.eval(text); // calls WordQuery::eval
        auto ret_lines = std::make_shared<std::set<int>>();
        auto beg = result.begin(), end = result.end();
        auto sz = result.get_file()->size();
        for (size_t n = 0; n != sz; ++n) {
            if (beg == end || *beg != n)
                ret_lines->insert(n);
            else if (beg != end)
                ++beg;
        }
        return QueryResult(rep(), ret_lines, result.get_file());
    }
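In NotQuery's eval() function:
Line 1 calls eval on NotQuery's data member Query query. This yields a QueryResult object, which is simply the result of a single-word query.
Line 2 creates a smart pointer ret_lines that points to an empty set<int> (the pointer itself is not null).
Line 3 calls the QueryResult member functions set<int>::iterator begin() { return wordlineSet->begin(); } and set<int>::iterator end() { return wordlineSet->end(); } .
Line 4 calls the QueryResult member function get_file() and takes size() of the result, which is the number of lines in the article.
Line 5 is a for loop that runs size() times, with n as the current zero-based line index. Taking ~was as the example, result is the result of querying was, so *beg is the next line index that contains was. If beg has reached end, or *beg != n, then line n does not contain was and n is inserted into ret_lines. Otherwise line n contains was, and the iterator beg is advanced by one.
The last line returns a QueryResult object.
A string is printed, followed by the Query object q. The << in front of q has been overloaded: std::ostream & operator<<(std::ostream &os, const Query &query) { return os << query.rep(); }
so what is printed is q.rep().
The last line calls print to output the results object; see the print function for details.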
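The loop in NotQuery::eval is a single pass that computes a set complement. The sketch below isolates that step in a hypothetical free function, complement, which is not part of the program. It assumes that every element of present is smaller than total, which holds in the program because the indices come from the same file.

```cpp
#include <cstddef>
#include <set>

// Hypothetical free function mirroring the loop in NotQuery::eval:
// returns every index in [0, total) that is absent from `present`.
// Assumes all elements of `present` are smaller than `total`.
inline std::set<std::size_t> complement(const std::set<std::size_t> &present,
                                        std::size_t total) {
    std::set<std::size_t> result;
    auto beg = present.begin(), end = present.end();
    for (std::size_t n = 0; n != total; ++n) {
        if (beg == end || *beg != n)
            result.insert(n);  // index n is not in `present`
        else
            ++beg;             // index n is in `present`: skip it
    }
    return result;
}
```

With the sample output above, was occurs on lines 1, 6 and 7, that is, indices 0, 5 and 6 out of 15 lines, so the complement holds the 12 remaining indices. The single pass works because std::set iterates in ascending order.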
-----------------------------------------------------------------------------------------------------------
3. Union of two single-word queries
int main()
{
    ifstream in("article.txt");
    TextQuery file = in;
    Query q = Query("were") | Query("was");
    const auto results = q.eval(file);
    cout << "Executing Query for: " << q << endl;
    print(cout, results) << endl;
}
Start from Query q = Query("were") | Query("was"); .
This calls the overloaded | operator, which returns the smart pointer shared_ptr<Query_base>(new OrQuery(lhs, rhs)).
The pointer member of the object q now points to that OrQuery(lhs, rhs) object.
In const auto results = q.eval(file); , q calls Query's eval() function, which is dispatched to OrQuery's eval() function.
    QueryResult OrQuery::eval(const TextQuery& text) const
    {
        auto right = rhs.eval(text), left = lhs.eval(text);
        auto ret_lines = std::make_shared<std::set<int> >(left.begin(), left.end());
        ret_lines->insert(right.begin(), right.end());
        return QueryResult(rep(), ret_lines, left.get_file());
    }
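Line 1 (inside the function body) evaluates the OrQuery data members rhs and lhs to create right and left. (Each is a single-word query here.) Both are of type QueryResult.
Line 2 creates the smart pointer ret_lines, which points to a newly created set<int> initialized with the range from left.begin() to left.end().
Line 3 inserts the range from right.begin() to right.end() into the set that ret_lines points to. (A set holds no duplicates, so repeated line numbers are dropped automatically.)
Line 4 returns a QueryResult object. (The left.get_file() argument could just as well be right.get_file(), because both return a pointer to the same file.)
The rest is the same as before.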
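The union step can be tried on its own. The sketch below uses a hypothetical helper, line_union, which is not part of the program and works on plain std::set<int> values rather than QueryResult objects.

```cpp
#include <set>

// Hypothetical helper showing the two steps used in OrQuery::eval:
// copy the left range into a new set, then insert the right range.
inline std::set<int> line_union(const std::set<int> &left,
                                const std::set<int> &right) {
    std::set<int> result(left.begin(), left.end());
    result.insert(right.begin(), right.end());  // duplicates are ignored
    return result;
}
```

For example, the union of {1, 2} and {2, 3} is {1, 2, 3}: the shared element 2 is stored only once.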
-----------------------------------------------------------------------------------------------------------
4. Intersection of two single-word queries
int main()
{
    ifstream in("article.txt");
    TextQuery file = in;
    Query q = Query("were") & Query("was");
    const auto results = q.eval(file);
    cout << "Executing Query for: " << q << endl;
    print(cout, results) << endl;
}
Start from Query q = Query("were") & Query("was"); .
This calls the overloaded & operator, which returns the smart pointer shared_ptr<Query_base>(new AndQuery(lhs, rhs)).
The pointer member of the object q now points to that AndQuery(lhs, rhs) object.
In const auto results = q.eval(file); , q calls Query's eval() function, which is dispatched to AndQuery's eval() function.
    QueryResult AndQuery::eval(const TextQuery& text) const
    {
        auto left = lhs.eval(text), right = rhs.eval(text); // calls WordQuery::eval
        auto ret_lines = std::make_shared<std::set<int>>();
        set_intersection(left.begin(), left.end(), right.begin(), right.end(),
                         inserter(*ret_lines, ret_lines->begin()));
        return QueryResult(rep(), ret_lines, left.get_file());
    }
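Line 1 (inside the function body) evaluates the AndQuery data members lhs and rhs to create left and right. (Each is a single-word query here.) Both are of type QueryResult.
Line 2 creates the smart pointer ret_lines, which points to a newly created, empty set<int> (the pointer itself is not null).
Line 3 uses the standard library algorithm set_intersection, which writes the elements common to left and right into the set through an inserter.
Line 4 returns a QueryResult object. (The left.get_file() argument could just as well be right.get_file(), because both return a pointer to the same file.)
The rest is the same as before.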
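The intersection step can likewise be isolated. The sketch below uses a hypothetical helper, line_intersection, which is not part of the program; it makes the same std::set_intersection call on plain sets.

```cpp
#include <algorithm>
#include <iterator>
#include <set>

// Hypothetical helper showing the call used in AndQuery::eval.
// std::set_intersection requires sorted input ranges, which std::set
// guarantees; std::inserter appends each common element to `result`.
inline std::set<int> line_intersection(const std::set<int> &left,
                                       const std::set<int> &right) {
    std::set<int> result;
    std::set_intersection(left.begin(), left.end(),
                          right.begin(), right.end(),
                          std::inserter(result, result.begin()));
    return result;
}
```

Two sets with no common element give an empty result, which print would report as occurring 0 times.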
-----------------------------------------------------------------------------------------------------------
5. Compound query
A compound query relies on the operator precedence rules of the language (to be continued...)
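As a small illustration of the precedence rule in question, and not the author's continuation, the sketch below uses a hypothetical Expr type that only records how an expression is grouped. In C++ the & operator binds more tightly than |, and overloading does not change that, so no parentheses are needed to obtain the grouping shown in the sample output.

```cpp
#include <string>

// Hypothetical stand-in for Query that only records the grouping.
struct Expr {
    std::string rep;
};

inline Expr operator&(const Expr &l, const Expr &r) {
    return Expr{"(" + l.rep + " & " + r.rep + ")"};
}

inline Expr operator|(const Expr &l, const Expr &r) {
    return Expr{"(" + l.rep + " | " + r.rep + ")"};
}

// No parentheses in the source expression: & is applied before |.
inline std::string grouped() {
    Expr q = Expr{"was"} & Expr{"the"} | Expr{"were"};
    return q.rep;
}

// The same rule with the operators in the other order.
inline std::string grouped_reversed() {
    Expr q = Expr{"a"} | Expr{"b"} & Expr{"c"};
    return q.rep;
}
```

Some compilers warn about mixing & and | without parentheses; writing the parentheses explicitly gives the same result and makes the intent clearer.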
-----------------------------------------------------------------------------------------------------------