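Suppose I have an expression of the form a*x + b*x - c*x - a*y - b*y + c*y + d. I know that I can simplify it to d + (x - y)*(a + b - c). However, sympy.simplify and sympy.factor both return the original expression.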
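A minimal sketch of the situation described above, assuming all seven names are plain SymPy symbols (the variable names `target`, `same_value` and `factored` are only illustrative):

```python
import sympy

a, b, c, d, x, y = sympy.symbols("a b c d x y")
expr = a*x + b*x - c*x - a*y - b*y + c*y + d
target = d + (x - y)*(a + b - c)

# The hand-factored form expands back to the original expression,
# so the two are mathematically equal.
same_value = sympy.expand(target) == expr

# factor() looks for a factorization of the whole polynomial. The lone
# "+ d" term rules that out, so no product is returned.
factored = sympy.factor(expr)
print(same_value, factored)
```

The point is that `factor` works on the expression as a whole, while the desired form only factors a subset of the additive terms.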
To work around this, I have been operating on the expression at a low level:
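    from collections import defaultdict
    from itertools import combinations
    import sympy

    # Map each shared factor to the set of additive terms that contain it
    factor_map = defaultdict(set)
    additive_terms = expr.as_coeff_add()[-1]
    for term1, term2 in combinations(additive_terms, 2):
        common_terms = (
            set(term1.as_coeff_mul()[-1])
            & set(term2.as_coeff_mul()[-1])
        )
        if common_terms:
            common_factor = sympy.Mul(*common_terms)
            factor_map[common_factor] |= {term1, term2}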
factor_map now looks like this:

    {
        a: {a⋅x, -a⋅y},
        b: {b⋅x, -b⋅y},
        c: {-c⋅x, c⋅y},
        x: {a⋅x, b⋅x, -c⋅x},
        y: {-a⋅y, -b⋅y, c⋅y}
    }
I sort it by the number of operations represented by the terms:
    factor_list = sorted(
        factor_map.items(),
        key=lambda i: (i[0].count_ops() + 1) * len(i[1])
    )[::-1]
Then I rebuild the expression:
    used = set()
    new_expr = 0
    for item in factor_list:
        factor = item[0]
        appearances = item[-1]
        terms = 0
        for instance in appearances:
            if instance not in used:
                terms += instance.as_coefficient(factor)
                used.add(instance)
        new_expr += factor * terms
    for term in set(additive_terms) - used:
        new_expr += term
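This gives new_expr = d + x*(a + b - c) + y*(-a - b + c). Not great, but better.
I can also improve on this by dividing each pair of additive terms by each other, checking whether the result is a number, and using that information to reduce the output further to new_expr = d + (x - y)*(a + b - c).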
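One possible reading of that refinement, as a rough sketch rather than the code actually used: treat the partially factored result as a list of (factor, cofactor) pairs and merge two pairs whenever the ratio of their cofactors is a plain number. The helper name `merge_proportional` and the pair representation are assumptions made for illustration.

```python
import sympy

def merge_proportional(groups):
    """Merge (factor, cofactor) pairs whose cofactors differ by a numeric ratio.

    `groups` is a hypothetical representation of sum(factor * cofactor).
    """
    merged = []  # entries are [combined_factor, cofactor]
    for factor, cofactor in groups:
        for entry in merged:
            ratio = sympy.cancel(cofactor / entry[1])
            if ratio.is_number:
                # cofactor == ratio * entry[1], so fold this factor in
                entry[0] += ratio * factor
                break
        else:
            merged.append([factor, cofactor])
    return sum(f * c for f, c in merged)

a, b, c, d, x, y = sympy.symbols("a b c d x y")
groups = [(x, a + b - c), (y, -a - b + c)]
result = d + merge_proportional(groups)
print(result)
```

Here the two cofactors differ by a factor of -1, so the `x` and `y` groups collapse into a single product.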
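I have also tried applying sympy.factor to every possible combination of additive terms, but obviously that blows up very quickly for any reasonably large expression.
Edit: Here is an implementation that uses sympy.factor on all partitions of the set of additive terms (partition function borrowed from this answer):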
    def partition(collection):
        if len(collection) == 1:
            yield [ collection ]
            return
        first = collection[0]
        for smaller in partition(collection[1:]):
            # insert `first` in each of the subpartition's subsets
            for n, subset in enumerate(smaller):
                yield smaller[:n] + [[ first ] + subset] + smaller[n+1:]
            # put `first` in its own subset
            yield [ [ first ] ] + smaller

    def partial_factor(expr):
        args = list(expr.as_coeff_add()[-1])
        # Groupings is just a cache to avoid duplicating factor operations
        groupings = {}
        unique = set()
        for p in partition(args):
            new_expr = 0
            for group in p:
                add_group = sympy.Add(*group)
                new_expr += groupings.setdefault(add_group, sympy.factor(add_group))
            unique.add(new_expr)
        return sorted(unique, key=sympy.count_ops)[0]
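For an expression like a*x + b*y + c*z + d + e*x + f*y + h*z, this takes 7.8 seconds to run on my computer, whereas the other method takes 378 microseconds and gives the same result. It seems like there should be a way to be more rigorous than the first method without taking 20,000 times longer.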
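To give a feel for why the partition approach scales so badly: the number of set partitions of n items is the Bell number B(n). The sketch below computes it with the Bell triangle in plain Python; `bell_number` is an illustrative helper, not part of the code above.

```python
def bell_number(n):
    """Number of ways to partition a set of n items (Bell triangle)."""
    row = [1]
    for _ in range(n - 1):
        # Each new row starts with the last entry of the previous row
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[-1]

for n in (4, 7, 10, 15):
    print(n, bell_number(n))
```

The example expression has 7 additive terms, so the loop over partitions runs B(7) = 877 times, each iteration calling sympy.factor on every group, and the count grows faster than exponentially from there.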
I feel like it shouldn't be this hard to get what I want. Is there an easier way?
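Solution:
It is hard to come up with a "partial factoring" strategy that works most of the time. Here is one thing to try, devised with your example in mind (a polynomial in several variables).
Given an expression: attempt to factor it. If that fails, look at the coefficient of each symbol it contains; the method Expr.coeff(Symbol) does that. The symbol with the smallest coefficient (measured by the number of symbols the coefficient contains) is considered an obstacle to factoring and is removed from the expression. Repeat.
This logic is encoded below, and partial_factor(a*x + b*x - c*x - a*y - b*y + c*y + d) does return d + (x - y)*(a + b - c).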
    from sympy import factor

    def partial_factor(expr):
        to_factor = expr
        non_factorable = 0
        while to_factor != 0:
            if factor(to_factor) != to_factor:
                return factor(to_factor) + non_factorable
            coeffs = {v: to_factor.coeff(v) for v in to_factor.free_symbols}
            if not coeffs:
                # Only a constant is left; min() below would fail on an empty list
                break
            min_size = min([len(coeffs[v].free_symbols) for v in coeffs])
            for v in coeffs:
                if len(coeffs[v].free_symbols) == min_size:
                    # Set aside the term blocking the factorization
                    non_factorable += v*coeffs[v]
                    to_factor -= v*coeffs[v]
                    break
        return expr
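To see which symbol the loop above would set aside first for the example from the question, the coefficients can be inspected directly. This is only an illustrative sketch; the names `coeffs` and `sizes` are chosen here and the symbols are assumed to be plain SymPy symbols.

```python
import sympy

a, b, c, d, x, y = sympy.symbols("a b c d x y")
expr = a*x + b*x - c*x - a*y - b*y + c*y + d

# Coefficient of each symbol, as returned by Expr.coeff
coeffs = {v: expr.coeff(v) for v in expr.free_symbols}

# "Size" of a coefficient = number of symbols it contains
sizes = {v: len(coeffs[v].free_symbols) for v in coeffs}

for v in sorted(coeffs, key=str):
    print(v, coeffs[v], sizes[v])
```

The coefficient of `d` is the constant 1, which contains no symbols, so `d` is the term that gets moved out before the remaining six terms are factored.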